
Data Management: Save Data

Save your data safely

Determine your storage needs

… to find appropriate storage services

Three issues to consider

  • Permanence

    • How long do you need to store the data? 3 to 5 years? 10 years? Forever?

    • Will you destroy obsolete data?

  • Oversight

    • Which versions of the data will you store as an official long-term copy?

    • Who will manage the data after project completion, over time, and across personnel changes?

  • Security

    • Are there ethical requirements for secure data storage (e.g., IRB, HIPAA)?

    • Do you need to discard or destroy any private, personal, confidential data after project completion?

 


Find a storage space

… that suits your needs for data control and sharing
 

Data storage options, with examples and pros (and cons)

  • Personal computer

    • Examples: Internal or external drive (CDs and DVDs are not recommended)

    • Pros (and cons): Personal control, but personal responsibility for theft, loss, and backups

  • Departmental or university servers

    • Examples: UC Berkeley's IST data centers and servers and data services

    • Pros (and cons): Managed and may have automated backups

  • Institutional repository/archive

    • Examples: Merritt at UC

    • Pros (and cons): Long-term storage controlled by the hosting institution

  • Public database/repository

    • Examples: GenBank

    • Pros (and cons): Long-term storage with access for the general public

  • Cloud storage

    • Examples: Amazon S3, Google Drive at Berkeley, Box at Berkeley, Dropbox

    • Pros (and cons): Online accessible, but sensitive data may be vulnerable with third party services

 

How much storage space? 

  • Consider the growth rate of data and how frequently it changes. 
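As a rough illustration of the growth-rate question, the sketch below projects how much space a data set might need after a few years. It assumes compound yearly growth, and the function name, parameters, and example numbers are all hypothetical. The copies parameter reflects that each backup copy needs its own space.

```python
def projected_storage_gb(current_gb, yearly_growth_rate, years, copies=3):
    """Estimate total storage (in GB) needed after a number of years.

    Assumes the data set grows by the same fraction every year
    (compound growth). All names and defaults are illustrative.
    """
    if current_gb < 0 or years < 0 or copies < 1:
        raise ValueError("sizes and years must be non-negative, copies at least 1")
    # Size of a single copy after compound growth.
    single_copy_gb = current_gb * (1 + yearly_growth_rate) ** years
    # Every copy of the data needs its own space.
    return single_copy_gb * copies


# Hypothetical example: 200 GB today, growing 25% per year, kept for 5 years.
estimate = projected_storage_gb(current_gb=200, yearly_growth_rate=0.25, years=5)
print(f"Plan for roughly {estimate:.0f} GB across all copies")
```

Data that changes frequently may need extra room for older versions, which this estimate does not cover.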

Tips

  • Uncompressed data are best, though it's okay to compress a third copy of the data.

  • Unencrypted data are best, though encryption is appropriate for sensitive data or a third copy.


Save in a file format for long-term access

... so your data can be opened and read in the future
 

Type of Document    Not ideal        Ideal

Text                MS Word          RTF or PDF
Spreadsheet         MS Excel         CSV
Image               GIF, JPG         TIFF
Sound               AAC (iTunes)     WAV
Video               Quicktime        MPEG-4
Databases           MS Access        XML or RDF

 

Here are more file format recommendations.  In general, use a file format with these features: 

  • Non-proprietary 
  • Unencrypted and uncompressed 
  • Open, documented standard (e.g., PDF, XML) 
  • Common usage by your research community 
  • Standard representation (e.g., ASCII text, Unicode) 

Look for discipline-specific standards for file formats. 
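One way to apply the format table above is to scan a list of file names and flag those saved in a "not ideal" format. The following is a minimal sketch: the file extensions are assumptions (the table names formats, not extensions), and the function and variable names are hypothetical.

```python
from pathlib import Path

# Extension -> suggested format, following the table above.
# The extensions are assumptions; the table lists formats only.
SUGGESTED_FORMAT = {
    ".doc": "RTF or PDF",
    ".docx": "RTF or PDF",
    ".xls": "CSV",
    ".xlsx": "CSV",
    ".gif": "TIFF",
    ".jpg": "TIFF",
    ".jpeg": "TIFF",
    ".aac": "WAV",
    ".mov": "MPEG-4",
    ".mdb": "XML or RDF",
    ".accdb": "XML or RDF",
}


def flag_non_ideal(file_names):
    """Return (name, suggestion) pairs for files in a 'not ideal' format."""
    flagged = []
    for name in file_names:
        # Compare extensions case-insensitively, so .JPG matches .jpg.
        suffix = Path(name).suffix.lower()
        if suffix in SUGGESTED_FORMAT:
            flagged.append((name, SUGGESTED_FORMAT[suffix]))
    return flagged


for name, suggestion in flag_non_ideal(["notes.docx", "results.csv", "scan.JPG"]):
    print(f"{name}: consider saving a copy as {suggestion}")
```

The sketch only reports candidates. Converting a file is a separate step, and a discipline-specific standard may be a better target than the general suggestions here.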


Descriptively name data files and folders

… so you'll find your data quickly

Avoid ambiguous file names like data1.csv.  Instead, use descriptive, consistent file names, as in the following examples:

  • 75-celsius-trial_exp-group_original.csv
  • 75-celsius-trial_control_ver002.csv

Consider these terms when assigning file names:

  • project title
  • experimental conditions and group
  • trial numbers
  • file version number indicating data modifications
  • date or time stamps
  • author initials

Additionally, avoid ambiguous and unorganized file directories.  Instead, organize files in a descriptive folder structure like this example:

Project-title
 > Trial 1
    >> Experimental
    >> Control
 > Trial 2
 > Trial 3
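If many projects share the same layout, a short script can create the folders so they are named the same way every time. This is a minimal sketch: the function name is hypothetical, and it assumes every trial has both an Experimental and a Control folder. The usage example writes to a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_project_folders(root, trials=3, groups=("Experimental", "Control")):
    """Create root/Trial N/group folders and return the created paths."""
    created = []
    for trial in range(1, trials + 1):
        for group in groups:
            folder = Path(root) / f"Trial {trial}" / group
            # parents=True builds missing parent folders;
            # exist_ok=True makes the script safe to run again.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Usage example in a scratch directory that is deleted afterwards.
with tempfile.TemporaryDirectory() as scratch:
    for folder in create_project_folders(Path(scratch) / "Project-title"):
        print(folder.relative_to(scratch))
```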

Tips

  • Document your naming conventions for future reference.

  • Keep a log that describes the contents, purpose, and use of each file, as well as any changes made.

  • Try version control software.  Here’s a directory of such programs.

  • If you need to rename your files in bulk, try these free tools.
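Bulk renaming can also be scripted. The sketch below, with hypothetical function names, adds a descriptive prefix to every CSV file in a folder. It first builds a plan you can review, then applies it, and it refuses to overwrite an existing file. The usage example runs in a temporary directory with placeholder files.

```python
import tempfile
from pathlib import Path


def plan_renames(folder, prefix, pattern="*.csv"):
    """Return (old_path, new_path) pairs without touching any files."""
    plan = []
    for old_path in sorted(Path(folder).glob(pattern)):
        new_path = old_path.with_name(f"{prefix}_{old_path.name}")
        plan.append((old_path, new_path))
    return plan


def apply_renames(plan):
    """Rename files, refusing to overwrite anything that already exists."""
    for old_path, new_path in plan:
        if new_path.exists():
            raise FileExistsError(f"{new_path} already exists")
        old_path.rename(new_path)


# Usage example with placeholder files in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    for name in ("data1.csv", "data2.csv"):
        (Path(scratch) / name).write_text("placeholder\n")
    plan = plan_renames(scratch, prefix="75-celsius-trial")
    for old_path, new_path in plan:
        print(f"{old_path.name} -> {new_path.name}")
    apply_renames(plan)
```

Reviewing the printed plan before calling apply_renames, and working on a backed-up copy, helps avoid losing track of files.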


Secure your data

... so only you and your team have full access

  • Secure portable storage devices - like laptops and flash drives - from loss, theft, or damage

  • Consider the security of third party storage services and beware of inappropriate management - especially with sensitive or confidential data

  • Secure rooms with computer hardware and use laptop locks

  • Use authentication systems - like password protection for a computer 

  • Encrypt files that contain sensitive, confidential, or private data.  Search online for encryption tools.

  • Record passwords and encryption keys - but be sure to store them safely.

    • For example, record passwords on paper (2 copies) and lock them in a file cabinet.  Alternatively, use software like Password Safe.


Back up your data

… so you don't lose your hard work
 

Have 3 copies.  Where to store each one:

  1. Original "master" copy

    • On your primary computer or network

  2. Local external storage

    • On an external hard drive in your lab or office

  3. Remote external storage (i.e. a physically removed location)

    • UC Berkeley IST storage and backup services

    • bDrive and the bConnected suite at UC Berkeley

    • Cloud storage via third party companies.  Be mindful of security threats when storing private, confidential, and sensitive data on third-party services.

 

Tips

  • Check file recovery at setup and on a regular schedule

  • Check that older files are still readable and accessible.  If necessary, migrate older files to a format that offers long-term access.
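One way to check a backup on a regular schedule is to compare checksums between the master copy and a backup copy. The sketch below uses SHA-256 from Python's standard library. The function names are hypothetical, and it assumes both copies are reachable as ordinary folders. It reports files that are missing from the backup or whose contents differ.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(master_dir, backup_dir):
    """Return (relative_path, problem) pairs for missing or differing files."""
    master_dir, backup_dir = Path(master_dir), Path(backup_dir)
    problems = []
    for master_file in sorted(master_dir.rglob("*")):
        if not master_file.is_file():
            continue
        relative = master_file.relative_to(master_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(master_file) != sha256_of(backup_file):
            problems.append((relative.as_posix(), "differs"))
    return problems


# Usage example with placeholder files in scratch directories.
with tempfile.TemporaryDirectory() as master, tempfile.TemporaryDirectory() as backup:
    (Path(master) / "trial.csv").write_text("1,2,3\n")
    (Path(backup) / "trial.csv").write_text("1,2,3\n")
    (Path(master) / "notes.txt").write_text("only in master\n")
    print(compare_copies(master, backup))
```

Matching checksums show that the bytes are identical, not that the file can still be opened, so it is still worth opening a sample of older files from time to time.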

Copyright © 2014-2016 The Regents of the University of California. All rights reserved. Except where otherwise noted, this work is subject to a Creative Commons Attribution-Noncommercial 4.0 License.