Digital Humanities

This guide is a sandbox for exploring digital humanities samples and tasks at UC Berkeley as supported by the libraries.

What is Data?

Part of the trick in defining "data" with regard to the humanities is that data can be just about anything. The books and letters we read are data, as are the pictures we look at and the videos we watch. We synthesize data for the essays and articles that we write. Those essays and articles are also data. One ends up with the question "what isn't data?"

It can be more useful, particularly in a digital environment, to think about what kind of data one is working with rather than whether something is data. There can be geographic data; metadata (i.e., data about data); publishing data; and so on.

One should also think about what form the data takes. In particular, you should pay attention to whether the data is 1) unstructured; 2) semi-structured; or 3) structured. The level of structure informs how to treat the data. If, for example, you want to work with images in a digital environment, you are likely going to have to do quite a bit of structuring.

In contrast, if you are working with a book of letters, that data is likely to be semi-structured, with information about who is involved and where and when something is taking place. Furthermore, those letters are likely written and/or published in a standardized format with heading information; text; and signatures at the end.

To move that book of letters from semi-structured data to structured data, one would then organize it, usually in a chart or table, into a machine-readable format.
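As a rough illustration of that move, the sketch below takes one letter laid out as heading, text, and signature and turns it into a row of a machine-readable table (CSV). Everything in it is an assumption made for the example: the sample letter is invented, the function names `letter_to_row` and `rows_to_csv` are hypothetical, and the parsing only works if the first line is "place, date", the second line is the salutation, and the last two lines are the closing and the signature. Real letters will usually need more careful handling.

```python
import csv
import io

# Invented sample letter for illustration only; not from any real collection.
SAMPLE_LETTER = """Berkeley, 3 March 1921
Dear Anna,

The books arrived safely this morning.

Yours,
Clara"""


def letter_to_row(text):
    """Split a letter laid out as heading / text / signature into named fields.

    Assumes: line 1 is "place, date", line 2 is the salutation,
    the second-to-last line is the closing, and the last line is the signature.
    """
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]

    # Heading: everything before the first comma is the place, the rest is the date.
    place, _, date = lines[0].partition(",")

    # Salutation: remove a leading "Dear " and a trailing comma if present.
    recipient = lines[1]
    if recipient.startswith("Dear "):
        recipient = recipient[len("Dear "):]
    recipient = recipient.rstrip(",")

    return {
        "place": place.strip(),
        "date": date.strip(),
        "recipient": recipient,
        "sender": lines[-1],
        # Body: whatever sits between the salutation and the closing line.
        "body": " ".join(lines[2:-2]),
    }


def rows_to_csv(rows):
    """Write a list of row dictionaries to CSV text, with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = letter_to_row(SAMPLE_LETTER)
table = rows_to_csv([row])
print(table)
```

Each letter becomes one row, and each piece of heading or signature information becomes a named column, which is what lets software sort, filter, or map the collection.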

Data Lifecycle

Most of us continually work with data in different forms. The general steps that most people will go through during the course of a project include:

  1. Data creation / collection
  2. Data management
    • Storage
    • Organization
  3. Data structuring
  4. Processing / cleaning
    • Write-up / description
  5. Preservation / destruction
  6. [Publication]

That said, most people will go back and forth between the different steps as necessary. It is common, for example, for people to process their data, realize they need more, and head back to the data collection stage.