R is a free software environment for statistical computing and graphics. RStudio is a development interface for R featuring a console, a syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging, and workspace management.
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
The Digital Scholar Lab provides cloud-based tools for automatically running common computational queries on, and creating visualizations from, content sets built from Gale primary source collections.
Primary source collections include: 17th and 18th Century Burney Collection; American Civil Liberties Union Papers, 1912-1990; American Fiction; Archives Unbound; Archives of Sexuality & Gender; British Library Newspapers; The Economist Historical Archive; Eighteenth Century Collections Online; Indigenous Peoples: North America; The Making of Modern Law; The Making of the Modern World; Nineteenth Century Collections Online; Nineteenth Century U.S. Newspapers; Sabin Americana, 1500-1926; The Times Digital Archive; The Times Literary Supplement Historical Archive; U.S. Declassified Documents Online.
This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation.
Text Analysis with R for Students of Literature is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological tool kit to include quantitative and computational approaches to the study of text.
Course materials from Teddy Roland's May 2016 D-Lab workshop.
For help with TDM access:
Send questions about text and data mining access to library resources to the shared email address above, which reaches librarians and campus partners with subject, copyright, technical, and licensing expertise.
For help with text mining tools and software, check out the D-Lab.
Questions and suggestions related to this guide can go to Stacy Reardon.