Computational Precision Health: Research Data Management

Introduction to Data Management

Data management is the care and maintenance of the data that is produced during the course of research. It is an integral part of the research process and helps to ensure that data is properly organized, described, preserved, and shared. Data management is important because:

  • Properly managed data and research outputs save time, money, and effort.
  • Data management is required by funding agencies and publishers.
  • Well-managed data aids in making your work reproducible.

Before Your Research


A data management plan is a formal document that outlines:

  • research workflow and information about the data that will be generated, collected, or reused
  • research output format, metadata, access and sharing policies, long-term storage, and budget

Creating a data management plan saves time by giving you a clear structure for organizing your data throughout the research life cycle, and it helps ensure that you and others will be able to use and understand your data in the future.


Resource for getting started: use the DMPTool to write a data management plan that meets funder and institutional requirements.

During Your Research


Set up and document workflows to ensure that data and other research outputs are secure. This includes properly backing up, protecting, and archiving data.

 

Start by following the 3-2-1 rule:

  • store three copies of your data
  • keep them on two different types of storage media
  • keep one copy offsite (for example, in the cloud)

Some research data may also fall under restricted or confidential categories. In those cases it is critical to comply with the applicable policies and to keep a record of that compliance.
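As a rough illustration of the 3-2-1 rule described above, the sketch below checks that at least three copies of a file exist and that they are byte-for-byte identical by comparing SHA-256 checksums. The function names `sha256_of` and `copies_match` and the example paths are hypothetical, chosen only for this example; the sketch does not verify that the copies sit on different media or offsite, and it is not a UC Berkeley tool or recommended procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(copies: list[Path]) -> bool:
    """True when there are at least three copies and all share one checksum."""
    if len(copies) < 3:
        return False
    return len({sha256_of(path) for path in copies}) == 1


if __name__ == "__main__":
    # Hypothetical locations: a working copy, an external drive, a synced cloud folder.
    candidates = [
        Path("data/survey.csv"),
        Path("/mnt/external/survey.csv"),
        Path("cloud_sync/survey.csv"),
    ]
    existing = [path for path in candidates if path.is_file()]
    print("all copies present and identical:", copies_match(existing))
```

Running a check like this on a schedule, and recording the result, is one way to notice a missing or silently corrupted copy before the other copies are lost.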

Resource for getting started: check out the active research data guidance grid to learn more about data types and storage options at UC Berkeley. 

After Your Research


Upon completion of a project, select an archival data repository to publish your research data outputs. Repositories ensure that your data will be stored and can be accessed for future use, either by you or by other researchers. Publishers and funding institutions have guidelines that address data access and archiving through the use of trusted data repositories, which ensure long-term archiving and discoverability.

Research whose data and other outputs are properly archived is more likely to be cited, reused, and discovered in search engines.


Resource for getting started: explore Dryad, the University of California's data publication service and repository.

Research Data Management Program at UC Berkeley

UC Berkeley's Research Data Management Program is available to consult before, during, and after your research on writing data management plans, encryption and security, metadata enrichment, data publishing and sharing, analysis and workflows, and more. Consultants provide individual and group consultations and training for researchers in all disciplines.

Get in touch with us on our website or via email at researchdata@berkeley.edu.