Library Guides: Responsible Conduct of Research: Getting Started

Responsible Conduct of Research: Resources & Help

This guide provides information and resources on data support and management for students and researchers participating in UC Berkeley's Responsible Conduct of Research (RCR) trainings. The Library Data Services and Research Data Management Programs provide a three-part module on proper data management and organization, with the goal of increasing rigor and transparency and adhering to funder standards. The sections below supplement information taught during the data management components of the training. If you have questions or comments about the content presented in this guide, please email librarydataservices@berkeley.edu.

Data Management Planning Resources

DMPTool: Build your Data Management Plan

The DMPTool is an online system that assists in developing a data management plan by providing manuals, practice templates, and information on how to meet specific funding agency requirements. Sign in to the tool using UC Berkeley credentials.

Learn More on UC Berkeley Grant Life Cycle

National Institutes of Health (NIH) Sharing Policies and Related Guidance on NIH-Funded Research Resources and Final NIH Policy for Data Management and Sharing effective January 25, 2023
- Updated guidelines on sharing scientific data from NIH-sponsored research; they acknowledge the importance of good data management practices as data output increases.
- Learn more at National Institutes of Health (NIH): NIH-GEN DMSP (Forthcoming 2023)
National Science Foundation (NSF) Dissemination and Sharing of Research Results - NSF Data Management Plan Requirements
- Policies on distributing the results of NSF-funded research, including guidelines for creating a data management plan for a proposal.

File Naming Best Practices

Best practices are procedures widely accepted as the most effective in a business or organizational system. Naming and categorizing data files consistently makes them easier to identify, manage, and share with others.

Ensure that your filenames have identifiers to aid in organizing and quickly accessing data. It can be helpful to include elements such as specific project names, dates, locations, and version numbers in creating a filename.
Tips:
- Create a consistent file-name template for each type of data file, and record the template codes in a README file (see below)
- Format dates as YYYYMMDD (four-digit year, two-digit month, two-digit day)
- Avoid long file names to maintain organization
- Do not use special characters such as ~ ! @ # $ % ^ & * ( ) ` ; < > ? , [ ] { } ' "
- Avoid spaces by using underscores, dashes, or camelCase
- Be careful with the placement of periods: they separate a file name from its extension, and in regular expressions a period matches any single character. A period at the beginning of a file name indicates a configuration or hidden file.
- Creating a master key in a spreadsheet template can help in naming files
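
The tips above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the template `project_location_YYYYMMDD_vNN.ext`, the function names `build_filename` and `is_valid_filename`, and the 64-character limit are all assumptions chosen for the example, and your own template should be the one recorded in your README file.

```python
import re
from datetime import date

# Assumed limit; the guide only says to avoid long file names.
MAX_LENGTH = 64


def build_filename(project: str, location: str, collected: date,
                   version: int, extension: str) -> str:
    """Assemble a name following the hypothetical template
    project_location_YYYYMMDD_vNN.ext."""
    for part in (project, location, extension):
        # Letters, digits, and dashes only: no spaces or special characters.
        if re.fullmatch(r"[A-Za-z0-9-]+", part) is None:
            raise ValueError(f"unsafe characters in {part!r}")
    name = f"{project}_{location}_{collected:%Y%m%d}_v{version:02d}.{extension}"
    if len(name) > MAX_LENGTH:
        raise ValueError(f"file name is longer than {MAX_LENGTH} characters")
    return name


def is_valid_filename(name: str) -> bool:
    """Check an existing name: safe characters only, exactly one period,
    and that period must not be the first character."""
    pattern = r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+"
    return len(name) <= MAX_LENGTH and re.fullmatch(pattern, name) is not None
```

For example, `build_filename("soilsurvey", "berkeley", date(2023, 1, 25), 2, "csv")` yields `soilsurvey_berkeley_20230125_v02.csv`, while a name containing spaces or parentheses fails the `is_valid_filename` check.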
A few resources:
- Stanford Libraries Data Best Practices provides additional information on filing data on a continuous basis.
- Cornell University's guidance on file formats provides more information on preferred file formats for data

Spreadsheet Best Practices

Following good spreadsheet practices is important to ensure data can be readily understood, analyzed, and reused. Data may not be exported or read correctly if spreadsheets do not follow these guidelines:

- Create a safety backup file before making changes
- Avoid empty cells. If there is missing data or no data value in a specific cell, indicate this by entering a code such as -999 or -9999.
- Avoid empty columns and rows
- Do not use special characters
- Steer clear of missing headers or headers in multiple places
- Do not merge cells
- Use entries in additional columns to convey information rather than colorful text or cell backgrounds
- Avoid commas
- Do not utilize embedded comments
- Do not enter multiple data types in a single column
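
Some of these guidelines can be checked automatically once a spreadsheet is exported to CSV. The sketch below uses only Python's standard `csv` module and covers a subset of the list: missing or repeated headers, empty rows, empty cells, and columns that mix numbers with text. The function name `check_table` and the set of missing-data codes are assumptions for illustration; formatting issues such as merged cells or colored text do not survive a CSV export and cannot be detected this way.

```python
import csv
import io

# Assumed missing-data codes, taken from the examples in the guideline above.
MISSING_CODES = {"-999", "-9999"}


def _is_number(cell: str) -> bool:
    try:
        float(cell)
        return True
    except ValueError:
        return False


def check_table(csv_text: str) -> list[str]:
    """Return a list of human-readable problems found in CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    problems = []

    # Headers: every column needs one, and names should not repeat.
    for index, name in enumerate(header, start=1):
        if not name.strip():
            problems.append(f"column {index}: missing header")
    if len(set(header)) != len(header):
        problems.append("duplicate header names")

    # Rows: flag empty rows, short or long rows, and empty cells.
    for line_number, row in enumerate(body, start=2):
        if not any(cell.strip() for cell in row):
            problems.append(f"row {line_number}: empty row")
            continue
        if len(row) != len(header):
            problems.append(
                f"row {line_number}: expected {len(header)} cells, found {len(row)}"
            )
        for index, cell in enumerate(row, start=1):
            if not cell.strip():
                problems.append(f"row {line_number}, column {index}: empty cell")

    # Columns: a column should hold one data type (missing codes are ignored).
    for index, name in enumerate(header):
        kinds = set()
        for row in body:
            if index < len(row):
                cell = row[index].strip()
                if cell and cell not in MISSING_CODES:
                    kinds.add("number" if _is_number(cell) else "text")
        if len(kinds) > 1:
            problems.append(f"column {name!r}: mixes numbers and text")
    return problems
```

An empty list means none of the checked problems were found; it does not mean the spreadsheet satisfies every guideline above.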

OpenRefine (previously known as Google Refine) is a resource that assists in cleaning and transforming data to make it more consistent and analyzable.

Learn more with the Stanford Libraries Data Best Practices Guide

README Files

README files include information describing a project and its resulting data. They enable the data to be understood and reused in the future. README files can be plaintext or human-readable Markdown (avoid MS Word).
- Important elements to consider including in a README file:
  - Title of project and dataset
  - Name and contact information for PI and responsible researcher
  - File name template, elements, and codes
  - File formats
  - Variable names, units, etc.
  - Data processing: how final data were derived from raw data
  - Versioning: change log for documenting file versions
Store the README in the top directory to which it applies
Free Downloadable Template courtesy of Cornell University
Make a README
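
As one way to get started, the elements listed above can be turned into an empty Markdown skeleton and saved in a project's top directory. This is only a sketch: the section wording, the `TODO` placeholders, and the function names `make_readme_skeleton` and `write_readme` are assumptions for illustration, and the Cornell template remains the fuller reference.

```python
from pathlib import Path

# Section headings adapted from the list of README elements in this guide.
README_SECTIONS = [
    "Title of project and dataset",
    "PI and responsible researcher (name and contact information)",
    "File name template, elements, and codes",
    "File formats",
    "Variable names and units",
    "Data processing (how final data were derived from raw data)",
    "Versioning (change log)",
]


def make_readme_skeleton(project_title: str) -> str:
    """Return Markdown text with one empty section per README element."""
    lines = [f"# {project_title}", ""]
    for section in README_SECTIONS:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)


def write_readme(top_directory: Path, project_title: str) -> Path:
    """Write README.md into the top directory, refusing to overwrite one."""
    target = top_directory / "README.md"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    target.write_text(make_readme_skeleton(project_title), encoding="utf-8")
    return target
```

Refusing to overwrite an existing file is a deliberate choice here, so that a README someone has already filled in is never replaced by an empty skeleton.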

Image Integrity

Images should be minimally processed, and the original raw image files should be retained. As with numerical data, make any changes to versioned duplicate files.
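
A minimal sketch of that workflow in Python: copy the raw file to a versioned duplicate before editing, and use a checksum to confirm later that the original is unchanged. The `_vNN` suffix and the function names `sha256_of` and `make_working_copy` are assumptions for illustration, not a required convention.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_working_copy(raw_image: Path, version: int) -> Path:
    """Copy a raw image to a versioned duplicate, leaving the original as is."""
    working = raw_image.with_name(
        f"{raw_image.stem}_v{version:02d}{raw_image.suffix}"
    )
    if working.exists():
        raise FileExistsError(f"{working} exists; choose a new version number")
    shutil.copy2(raw_image, working)  # copy2 also preserves file timestamps
    return working
```

Recording the checksum of the raw file when it is first saved, for instance in the README, makes it possible to show later that the retained original was never altered.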

Guidelines for best practices in image processing (U.S. Office of Research Integrity via Wayback Machine)
Questionable practices (U.S. Office of Research Integrity via Wayback Machine)

Be sure to check specific publisher and journal image submission requirements. Here are some general guidelines for researchers (Section A.1) from the STM Publishers Association:

Recommendations for handling image integrity issues

A quick guide to key image alteration best practices:

Tips for presenting scientific images with integrity
(U.S. Office of Research Integrity)

Documentation

General resources on the FAIR principles, disciplinary metadata standards, README files, and file naming conventions:

FAIR principles (making data Findable, Accessible, Interoperable, Reusable)
Metadata standards, policies, and databases (from FAIRsharing.org)
README template and examples (from Cornell University)
README guide and examples (from Harvard Medical School)
Disciplinary metadata standards (from Digital Curation Centre)
File naming conventions (from Harvard Medical School)
