
Natural History & Zoology: Data

Data Management Basics

NSF Data Management Plan Requirements: Proposals submitted on or after January 18, 2011, must include a supplementary document of no more than two pages labeled “Data Management Plan”.

Collaboration, accessibility and transparency are necessary for data management in modern science. Science.gov, NSF, NIH and other federal agencies require data management plans with grant proposals.
Start with this list:
---- Make a plan to store your data
---- Find the right medium to store your data
---- Develop a system to organize your data
---- Make sure that your data is easy to access
---- Make sure that your data is safe and secure
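
One way to support the "organize" and "safe and secure" items above is to keep a checksum manifest next to your data, so that accidental changes or corrupted copies can be detected later. The following is a minimal sketch, not a recommended or official tool: the function names (sha256_of, build_manifest, changed_files) and the directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified since the manifest was built."""
    current = build_manifest(data_dir)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

A typical use would be to call build_manifest once after data collection, store the result with the project documentation, and run changed_files before sharing or archiving the data.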

Data Repositories: Integrative Biology & the Environment

Atmospheric Radiation Monitoring (ARM) Data Archive
preserves data collected through the operations and scientific field experiments of the ARM Climate Research Facility.

Carbon Dioxide Information Analysis Center (CDIAC)
the primary climate-change data and information analysis center of the U.S. Department of Energy (DOE). CDIAC's data includes records of the concentrations of carbon dioxide and other radiatively active gases in the atmosphere; the role of the terrestrial biosphere and the oceans in the biogeochemical cycles of greenhouse gases; emissions of carbon dioxide to the atmosphere; long-term climate trends; the effects of elevated carbon dioxide on vegetation; and the vulnerability of coastal areas to rising sea level.

Chesapeake Bay Environmental Observatory (CBEO)
available for registering datasets of different types, and searching for CBEO-registered data or for data registered in all projects within the GEON family of federated portals.

Computational and Information Systems Laboratory (CISL) Research Data Archive
contains meteorological and oceanographic observations, operational and reanalysis model outputs, and remote sensing datasets to support atmospheric and geosciences research, along with ancillary datasets, such as topography/bathymetry, vegetation, and land use.

Dryad
International repository of data underlying peer-reviewed articles in the basic and applied biosciences, governed by a consortium of journals that collaboratively promote data archiving and ensure the sustainability of the repository.

Geo.Data.gov
Geographic information system (GIS) portal, also known as the Geospatial One-Stop (GOS), contains geospatial metadata records and links to live maps, features, catalog services, downloadable data sets, images, clearinghouses, map files, and more.

Global Biodiversity Information Facility (GBIF)
allows researchers to publish and discover biodiversity data—taxon primary occurrence data, taxonomic checklists and resource metadata—as part of a distributed global network.

Ecological Society of America's Ecological Archives
publishes materials supplemental to articles that appear in the ESA print journals (Ecology, Ecological Applications, and Ecological Monographs), and peer-reviewed Data Papers.

Knowledge Network for Biocomplexity (KNB)
national network designed to facilitate the discovery and analysis of distributed ecological and environmental datasets.

National Ecological Observatory Network (NEON)
continental-scale research platform for discovering and understanding the impacts of climate change, land-use change, and invasive species on ecology. It will consist of distributed sensor networks and experiments to record and archive ecological data for at least 30 years using standardized protocols and an open data policy.

Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC)
seeks to assemble, distribute, and archive data in terrestrial biogeochemistry and ecosystem dynamics of global environmental change.

Ocean Biogeographic Information System (OBIS)
Established by the Census of Marine Life (CoML). It is an evolving strategic alliance of people and organizations sharing a vision to make marine biogeographic data, from all over the world, freely available over the World Wide Web.

Paleobiology Database
provides global, collection-based occurrence and taxonomic data for marine and terrestrial animals and plants of any geological age, as well as web-based software for statistical analysis of the data.

PANGAEA® (Publishing Network for Geoscientific and Environmental Data)
Open Access library aimed at archiving, publishing and distributing data from earth system research. The system guarantees reference and long-term availability of its content through data set citations using international standard formats and persistent identifiers (DOI).

Smithsonian Tropical Research Institute's (STRI) Center for Tropical Forest Science (CTFS)
comprises a global network of large-scale and long-term studies that together monitor more than three million individual tropical trees, representing more than 6,000 tree species — nearly 10% of the world’s entire tropical tree flora.

TreeBASE
relational database of phylogenetic information hosted by the Yale Peabody Museum. TreeBASE stores phylogenetic trees and the data matrices used to generate them from published research papers. TreeBASE accepts all types of phylogenetic data (e.g., trees of species, trees of populations, trees of genes) representing all biotic taxa.

USA National Phenology Network (USA-NPN)
developing list of registered phenology data sets to make available to the research community and the general public.

VegBank
vegetation plot database of the Ecological Society of America's Panel on Vegetation Classification. Vegetation records, community types and plant taxa may be submitted to VegBank and may be subsequently searched, viewed, annotated, revised, interpreted, downloaded, and cited.

VertNet
global museum database of vertebrate natural history collections. Vertebrate records are shared online through four distributed database networks organized by biological discipline: MaNIS (mammalogy), HerpNET (herpetology), ORNIS (ornithology) and FishNet (ichthyology).

World Data Center for Human Interactions in the Environment
archives and distributes global data sets related to population, sustainability, poverty, health, hazards, conservation, governance and climate. It is hosted by Columbia University Earth Institute's Center for International Earth Science Information Network (CIESIN).

Collaboration, file sharing and storage

bDrive: Berkeley instance of Google Drive; unlimited storage
Berkeley Box: 50 GB of file storage
bCourses Projects: File storage and collaborative projects/courses (2 GB)
Project Jupyter: Open source, interactive data science and scientific computing platform for research data documentation

Electronic Lab Notebooks (ELNs) with free trials (UC Berkeley does not recommend or discourage the use of any particular ELN):
Biovia
eCat
Elements
Evernote
LabArchives
Labguru

Data Services

Data repositories and more

Integrative Biology and the Environment - Data Repositories - a selection of websites.

To find additional data repositories:
Open Access Directory of Data Repositories [Simmons University]
DataBib [Institute of Museum and Library Services]
Repositories [DataCite]

Useful sites:
Data Conservancy Organization (NSF): collects, organizes, and preserves data.
Many Eyes: data visualization tools from IBM.
Data Management & Publishing Checklist, MIT
NSF Division of Institution & Award Support

Others:
US Naval Observatory (USNO) - Oceanography Portal: includes a range of astronomical data and products, & serves as the official source of time for the U.S. Department of Defense & a standard of time for the entire United States.

Data Management

Data Management Tools & Services

DMPTool: Step-by-step instructions for creating a ready-to-use data management plan (DMP) that meets the requirements of specific funding agencies

DASH: A simple self-service tool for researchers to use in publishing their datasets. 

EZID: Create persistent identifiers (DOIs and ARKs) for digital content. Berkeley researchers can contact data-consult@lists.berkeley.edu for a free account.
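
As an illustration of how persistent identifiers are used once they have been created, the sketch below turns a DOI or ARK string into a resolver link. It assumes the public resolvers at doi.org (for DOIs) and n2t.net (for ARKs). The function name resolver_url and the example identifiers are placeholders for illustration, not identifiers issued by EZID.

```python
from urllib.parse import quote


def resolver_url(identifier: str) -> str:
    """Build a resolver URL for an identifier written as 'doi:...' or 'ark:...'."""
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        # DOIs resolve through doi.org; drop the 'doi:' label first.
        return "https://doi.org/" + quote(ident[4:].strip(), safe="/")
    if lowered.startswith("ark:"):
        # ARKs keep their 'ark:' label when resolved through n2t.net.
        return "https://n2t.net/" + quote(ident, safe="/:")
    raise ValueError(f"Unrecognized identifier scheme: {identifier!r}")


# Placeholder identifiers, for illustration only.
print(resolver_url("doi:10.1234/example"))
print(resolver_url("ark:/99999/fk4example"))
```

Citing the resolver link rather than a repository's page address keeps the citation valid if the dataset later moves.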

Data Management Guidelines

Data management and sharing: A portal for information on data requirements, management and sharing [Science Libraries @ UC Berkeley]

Data Management General Guidance: Guidelines for creating, organizing, managing, and sharing your data [CDL]

NSF data management plan requirements: An outline from the NSF Directorate for Biological Sciences

Preparing data management plans for NSF grant applications: UC Berkeley-specific tutorials and guidelines for NSF data management plans [Science Libraries @ UC Berkeley]

UC3, University of California Curation Center - UC/California Digital Library

UC Berkeley Data Services Management, IST.  See comparative table.


Copyright © 2014-2016 The Regents of the University of California. All rights reserved. Except where otherwise noted, this work is subject to a Creative Commons Attribution-Noncommercial 4.0 License.