Text Mining and AI Research Resources

AI and TDM with UC Berkeley Library Resources

This guide will help you understand which Library-licensed resources are available for text and data mining (TDM) or AI model training. Browse the Resources page to learn which resources are contractually friendly to TDM and AI. To learn more about TDM and how it is conducted, see What is Text and Data Mining (TDM).

Before you scrape data and before you train AI models, please read the text below! 

Before You Scrape, Before You Train

Please review the general terms and conditions that apply to all Library electronic resources (journal articles, books, databases, and more), as set out in the Library's conditions of use for electronic resources.

In addition, before using any UC Berkeley Library-licensed content (journal articles, books, databases, and more) with AI tools or for text and data mining research, check this guide to see what our license agreements allow. If you don't see your resource or database listed, email tdm-access@berkeley.edu and we will tell you what is permitted.

Compliance is required

Violating license agreements can cost the entire campus its access to critical research resources and can expose both you and the University to legal liability.

Impacts on the University:

  • Loss of access: Publishers can immediately cut off access to critical research resources for everyone on campus
  • Legal liability: The University could face costly lawsuits. Some publishers might claim millions of dollars in damages
  • Damaged relationships: Violations can harm the Library's ability to negotiate future agreements and prevent us from getting you access to critical content

Impacts on you:

  • Immediate suspension of your access to all Library electronic resources
  • Legal exposure: You could be held personally liable for damages in a lawsuit
  • Research disruption: Loss of access to essential materials for your work

Whether you are using UC Berkeley's own AI platforms (like Gemini or River), a personal generative AI account (like ChatGPT or Claude), or public websites, you must still check whether you are allowed to upload Library-licensed content to that platform.

Get Help

For text mining and AI questions: tdm-access@berkeley.edu

For other licensing questions: acq-licensing@lists.berkeley.edu

For copyright and fair use guidance: schol-comm@berkeley.edu

This guidance is for informational purposes and should not be construed as legal advice. When in doubt, always contact library staff for assistance with specific situations.