Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources... The difference between regular data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts.
(From What is Text Mining? by Marti Hearst)
Start with the Library's research guide, Text Mining & Computational Text Analysis! It has a separate page for each of the topics it covers.
For more on what you can and cannot do with text data mining (TDM), check out the text data mining section of the Library's Office of Scholarly Communication Services copyright page.
You may also want to take a look at the Library's Spring 2023 Digital Publishing Series, with workshop offerings on TDM-related topics!
The site for all things digital humanities (DH) on campus is Digital Humanities at Berkeley, which includes information on the DH working group, a listserv, and training sessions.
Digital humanities are an interdisciplinary set of fields that are primarily concerned with using digital technologies, sources and methods as part of research in the humanities. These fields are heavily involved with using electronic information and computational methods to investigate, analyse, synthesise and present research. Digital humanities also aim to explore how electronic media affects research in the discipline and likewise how humanities research contributes to computer studies. This is a rapidly emerging area of research, encompassing a wide range of methods and practices.
(From the University of Sydney)