"Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources... The difference between regular data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts."
- from What is Text Mining? by Marti Hearst (2003)
We've compiled sources for data, texts, and more available through the UC Berkeley Library and on the open web. Use the menu to navigate our listings by category. To suggest or request data not listed on this guide, please email firstname.lastname@example.org.
Help us keep library databases available for everyone. Before you programmatically scrape a library database or website, check its terms of service and any APIs it offers, or contact the library for help. Using Python, Selenium, or other programmatic tools to scrape database search results, no matter how carefully, can result in access being shut down for the entire campus.
Complex Jobs: OCR Desktop