
Responsible Conduct of Research: Getting Started

Responsible Conduct of Research Resources and Help at UC Berkeley

This guide provides information and resources on data support and management for students and researchers participating in UC Berkeley's Responsible Conduct of Research (RCR) trainings. The Library Data Services and Research Data Management Programs provide a three-part module that addresses proper data management and organization, with the goal of increasing rigor and transparency and adhering to funder standards. The following boxes and tabs supplement information taught during the data management components of the training. If you have questions or comments about the content presented in this guide, please email librarydataservices@berkeley.edu.

Resources for Data Management Planning

DMP Tool: Build your Data Management Plan 

The DMPTool is an online system that can assist in developing a data management plan by providing manuals and practice templates, along with information on how to meet specific funding agency requirements. Sign in to the tool using UC Berkeley credentials.

Learn More on UC Berkeley Grant Life Cycle

File Naming Best Practices

Best practices are procedures widely accepted as the most effective within a business or organizational system. Applying them consistently when naming and categorizing data files makes the files easier to identify, manage, and share with others.

  • Ensure that your filenames have identifiers to aid in organizing and quickly accessing data. It can be helpful to include elements such as specific project names, dates, locations, and version numbers in creating a filename.

  • Tips:

    • Create a consistent file-name template for each type of data file, and record the template codes in a README file (see below)

    • Format dates as YYYYMMDD (four digit year, two digit month, two digit day)

    • Keep file names short so they stay readable and easy to organize

    • Do not use special characters such as ~ ! @ # $ % ^ & * ( ) ` ; < > ? , [ ] { } ' "

    • Avoid spaces by using underscores, dashes, or camelCase

    • Be careful with the placement of periods, because they designate file extensions and act as wildcards in regular expressions. A period at the beginning of a file name indicates a configuration or hidden file.

    • Creating a master key with a spreadsheet template can help in naming files

  • A few resources
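
The naming tips above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the template (project, location, date, version), the helper names `build_filename` and `is_clean`, and the 50-character limit are all assumptions chosen for the example.

```python
import re
from datetime import date

# Assumed template for this example: project_location_YYYYMMDD_vNN.ext
# Letters, digits, underscores, and dashes only, with a single period
# before the extension (no spaces, no special characters).
ALLOWED = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")


def build_filename(project: str, location: str, when: date,
                   version: int, ext: str) -> str:
    """Assemble a file name from the template elements."""
    parts = [project, location, when.strftime("%Y%m%d"), f"v{version:02d}"]
    return "_".join(parts) + "." + ext


def is_clean(name: str, max_len: int = 50) -> bool:
    """Check length and characters; max_len is an arbitrary example limit."""
    return len(name) <= max_len and ALLOWED.fullmatch(name) is not None


name = build_filename("soilsurvey", "berkeley", date(2024, 3, 7), 2, "csv")
print(name)            # soilsurvey_berkeley_20240307_v02.csv
print(is_clean(name))  # True
print(is_clean("my data (final).csv"))  # False
```

Whatever template you choose, record its elements and codes in the README file so that others can decode the names.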

How to Create and Manage README Files

  • README files include information describing a project and its resulting data. They enable the data to be understood and reused in the future. README files can be plaintext or human-readable Markdown (avoid MS Word).  
    • Important elements to consider including in a README file: 
      • Title of project and dataset

      • Name and contact information for PI and responsible researcher

      • File name template, elements, and codes

      • File formats

      • Variable names, units, etc.

      • Data processing: how final data were derived from raw data

      • Versioning: change log for documenting file versions

  • Store the README in the top directory to which it applies

  • Free Downloadable Template courtesy of Cornell University

  • Make a README
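
As a starting point, the elements listed above can be turned into a plain-text README skeleton. The sketch below is a hypothetical helper, not the Cornell template: the section wording follows the list in this guide, while the function names, the `README.txt` file name, and the `TODO` placeholder are assumptions.

```python
from pathlib import Path

# Section headings taken from the list of README elements in this guide.
SECTIONS = [
    "Title of project and dataset",
    "PI and responsible researcher (name and contact information)",
    "File name template, elements, and codes",
    "File formats",
    "Variable names and units",
    "Data processing (how final data were derived from raw data)",
    "Versioning (change log)",
]


def readme_skeleton(sections=SECTIONS) -> str:
    """Return a plain-text README with one placeholder per section."""
    lines = []
    for heading in sections:
        lines.append(heading)
        lines.append("-" * len(heading))  # simple plain-text underline
        lines.append("TODO")
        lines.append("")
    return "\n".join(lines)


def write_readme(directory: Path) -> Path:
    """Write README.txt into the given directory unless one already exists."""
    path = directory / "README.txt"
    if not path.exists():
        path.write_text(readme_skeleton(), encoding="utf-8")
    return path


print(readme_skeleton())
```

Calling `write_readme(Path("my_project"))` on an existing project folder would place the skeleton at the top of that folder without overwriting a README that is already there.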

Spreadsheet Best Practices

Following good spreadsheet practices is important to ensure data can be readily understood, analyzed, and reused. Data may not be exported or read correctly if spreadsheets do not follow these guidelines:

  1. Create a safety backup file before making changes 

  2. Avoid empty cells. If there is missing data or no data value in a specific cell, indicate this by entering a code such as -999 or -9999.

  3. Avoid empty columns and rows 

  4. Do not use special characters

  5. Steer clear of missing headers or headers in multiple places 

  6. Do not merge cells 

  7. Use entries in additional columns to convey information rather than colorful text or cell backgrounds 

  8. Avoid commas  

  9. Do not use embedded comments 

  10. Do not enter multiple data types in a single column 
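
Some of these guidelines can be checked automatically once a spreadsheet has been exported to CSV. The following is a minimal sketch using Python's standard `csv` module. It covers only missing headers, empty rows, and empty cells; the function name `check_rows`, the sample data, and the choice of -999 as the missing-data code are assumptions for illustration.

```python
import csv
import io

MISSING_CODE = "-999"  # assumed missing-data code; use your project's own


def check_rows(rows):
    """Return a list of problems found in a header row plus data rows."""
    if not rows:
        return ["no header row"]
    problems = []
    header, *data = rows
    for col, name in enumerate(header, start=1):
        if not name.strip():
            problems.append(f"column {col}: missing header")
    # Data rows are numbered from 2 because row 1 is the header.
    for r, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {r}: expected {len(header)} cells, found {len(row)}")
        if not any(cell.strip() for cell in row):
            problems.append(f"row {r}: empty row")
            continue
        for col, cell in enumerate(row, start=1):
            if not cell.strip():
                problems.append(
                    f"row {r}, column {col}: empty cell (use {MISSING_CODE})")
    return problems


# Made-up sample data with one empty cell in the last row.
sample = "site,date,temp_c\nA1,20240307,12.5\nA2,20240308,\n"
rows = list(csv.reader(io.StringIO(sample)))
for problem in check_rows(rows):
    print(problem)  # row 3, column 3: empty cell (use -999)
```

A check like this does not replace reviewing the spreadsheet by hand; merged cells, colored text, and embedded comments are lost on export and have to be resolved in the spreadsheet itself.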

OpenRefine (previously known as Google Refine) is a resource that assists in cleaning and transforming data to make it more consistent and analyzable. 

Learn more with the Stanford Libraries Data Best Practices Guide

Documentation

General list of resources for disciplinary metadata standards, including FAIR principles, writing README files, and file naming conventions:

Contact

Library Data Services Program

Research Data Management