Text Mining and AI Research Resources

What is Text and Data Mining (TDM)?

"Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources... The difference between regular data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts."

- from "What is Text Mining?" by Marti Hearst

What is the difference between Text and Data Mining (TDM) and Generative AI?

| Feature | Text & Data Mining (TDM) | Generative AI (genAI) |
|---|---|---|
| Purpose | extracts insights from existing data | creates new content |
| Output | analysis, summaries, visualizations | new text, images, audio, code |
| Data use | mining large text/data collections | training on large datasets to generate new content |
| Example | finding sentiment in news articles | writing a news article |
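
To make the TDM example above concrete, here is a minimal sketch of one simple way to find sentiment in news text: counting words from a positive list and a negative list. The word lists, the sample headlines, and the function names `tokenize` and `sentiment_score` are all assumptions made up for illustration; they do not come from this guide or from any Library dataset, and real projects typically rely on a published sentiment lexicon or a trained model.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; not a real lexicon.
POSITIVE = {"gain", "growth", "improve", "success", "win"}
NEGATIVE = {"crisis", "decline", "fail", "loss", "risk"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Return positive word hits minus negative word hits for one document."""
    counts = Counter(tokenize(text))
    positive = sum(counts[word] for word in POSITIVE)
    negative = sum(counts[word] for word in NEGATIVE)
    return positive - negative


# Made-up headlines standing in for a real news collection.
articles = [
    "Local startup reports record growth and a major contract win",
    "Officials warn of budget crisis after steep revenue decline",
    "City council meets on Tuesday",
]

# One score per article: above zero leans positive, below zero leans negative.
scores = [sentiment_score(article) for article in articles]

for article, score in zip(articles, scores):
    print(f"{score:+d}  {article}")
```

The output here is a set of scores describing existing text, which matches the TDM column of the table: nothing new is written, only measured.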

The Library offers a wealth of texts and data for your research. Use the navigation menu to browse by type of data. 

To suggest or request data not listed on this guide, please email tdm-access at berkeley.edu.

Learn about TDM

Guide Books and Online Tutorials