"Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources... The difference between regular data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts."
- from What is Text Mining? by Marti Hearst
What is the difference between Text and Data Mining (TDM) and Generative AI?
| Feature | Text & Data Mining (TDM) | Generative AI (genAI) |
| --- | --- | --- |
| Purpose | extracts insights from existing data | creates new content |
| Output | analysis, summaries, visualizations | new text, images, audio, code |
| Data use | mines large text/data collections for patterns | trains on large datasets in order to generate new content |
| Example | finding sentiment in news articles | writing a news article |
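
To make the TDM example in the table concrete, here is a minimal sketch of lexicon-based sentiment scoring in Python. It is illustrative only: the word lists, the `sentiment_score` function, and the two sample "articles" are all hypothetical and invented for this example, and real projects typically rely on curated sentiment lexicons or trained classifiers rather than a handful of hand-picked words.

```python
import re
from collections import Counter

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"gain", "growth", "improve", "success", "win"}
NEGATIVE = {"crisis", "decline", "fail", "loss", "risk"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    positive_hits = sum(counts[word] for word in POSITIVE)
    negative_hits = sum(counts[word] for word in NEGATIVE)
    return positive_hits - negative_hits


# Invented sample texts standing in for a collection of news articles.
articles = {
    "a1": "Local firms report growth and a big win for exports.",
    "a2": "Analysts warn of decline, loss, and rising risk.",
}

# Score every article in the collection.
scores = {article_id: sentiment_score(text) for article_id, text in articles.items()}
print(scores)
```

The point of the sketch is the contrast drawn in the table: the code only measures properties of text that already exists, whereas a generative AI system would produce a new article.
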
The Library offers a wealth of texts and data for your research. Use the navigation menu to browse by type of data.
To suggest or request data not listed on this guide, please email tdm-access at berkeley.edu.
Guide Books and Online Tutorials