NCBI Bioinformatics Resources: An Introduction: BioProject, BioSample, SRA

BioProject home page

BioProject can be searched directly by keyword or field:

| Find BioProjects by | Search text example(s) |
| --- | --- |
| Project data type | "metagenome"[Project Data Type] |
| Publication information | 19643200[PMID] |
| Material used | "material transcriptome"[Properties] |
| Sample scope | "scope environment"[Properties] |
| Species name | Escherichia coli[organism] |
| Submitter organization, consortium, or center | JGI[Submitter Organization] |
| Taxonomic class | Insecta[organism] |
| BioProject database identifier | PRJNA33823[bioproject] or 33823[uid] or 33823[bioproject] |
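
The field-tagged searches listed above all follow the pattern value[Field]. A minimal Python sketch of how such search strings could be assembled is given here; the helper names `tagged_term` and `combine` are hypothetical and not part of any NCBI tool, and the resulting string is simply what would be typed into the BioProject search box.

```python
def tagged_term(value: str, field: str) -> str:
    """Return a search term of the form value[Field].

    Multi-word values are wrapped in double quotes so they are
    treated as a phrase.
    """
    needs_quotes = " " in value and not value.startswith('"')
    text = f'"{value}"' if needs_quotes else value
    return f"{text}[{field}]"


def combine(*terms: str, operator: str = "AND") -> str:
    """Join several terms with a Boolean operator (AND, OR, NOT)."""
    return f" {operator} ".join(terms)


# Metagenome projects submitted by JGI (both tags appear in the list above).
query = combine(
    tagged_term("metagenome", "Project Data Type"),
    tagged_term("JGI", "Submitter Organization"),
)
print(query)
```

The same helpers can produce any of the listed examples, for instance `tagged_term("scope environment", "Properties")`.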

 

Example: To find the BioProject record for the 2014 metagenomic survey of the New York City subway system:

  1. Enter "New York City" AND subway in the BioProject search box and click Search
  2. Note the filters on the left-hand side, which can narrow the search if too many results are retrieved
  3. Select urban metagenome (accession: PRJNA271013)
  4. Note the information about the project contained in the record; in particular, the links to associated publications and related resources.
  5. In the right-hand menu, note the links to Related information in other NCBI databases
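
The keyword search in step 1 can also be run programmatically. The sketch below assumes NCBI's E-utilities ESearch service, which is not covered in this guide; the function names are illustrative. Only the URL is built when the block runs, and `search_ids` needs network access if it is called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of NCBI E-utilities (an assumption; not mentioned in this guide).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch request URL for the given database and search text."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def search_ids(db: str, term: str, retmax: int = 20) -> list:
    """Run the search and return the matching record UIDs (requires network)."""
    with urlopen(esearch_url(db, term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Same search text as step 1 of the example above.
url = esearch_url("bioproject", '"New York City" AND subway')
print(url)
```

Calling `search_ids("bioproject", '"New York City" AND subway')` would return a list of BioProject UIDs, which should correspond to the results shown in the web interface.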

BioProject records link reciprocally to their constituent BioSample and SRA records:

  1. Scroll down to Project Data: this project comprises 1457 BioSample records and 1572 SRA experiments.
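
The links from a BioProject to its BioSample and SRA records can be followed outside the browser as well. This is a rough sketch assuming the E-utilities ELink service; it also assumes that the numeric part of a PRJNA accession is the record's UID, as the identifier examples in the search table suggest (PRJNA33823 and 33823[uid]). The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Base address of NCBI E-utilities (an assumption; not mentioned in this guide).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def bioproject_uid(accession: str) -> str:
    """Strip the PRJNA prefix, e.g. PRJNA271013 -> 271013 (assumed UID rule)."""
    prefix = "PRJNA"
    digits = accession[len(prefix):]
    if not accession.startswith(prefix) or not digits.isdigit():
        raise ValueError(f"not a PRJNA accession: {accession!r}")
    return digits


def elink_url(uid: str, target_db: str) -> str:
    """Build an ELink URL from one BioProject UID to a linked database."""
    params = {"dbfrom": "bioproject", "db": target_db, "id": uid, "retmode": "json"}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"


uid = bioproject_uid("PRJNA271013")
for target in ("biosample", "sra"):
    print(elink_url(uid, target))
```

Opening either URL should list the UIDs of the linked records; the counts would be expected to match those shown under Project Data.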

BioSample home

BioSample can be searched independently by keyword or using field tags and filters.

Types of BioSamples: https://submit.ncbi.nlm.nih.gov/biosample/template/

Or, BioSample records can be accessed from their associated BioProject:

Example: To find all the BioSample records from the 2014 metagenomic survey of the New York City subway system:

  1. From the BioProject "urban metagenome" record under Project Data, right-click on Other Datasets: BioSample: 1457 to see all of the BioSample records for this BioProject
  2. To see details for each sample, change Summary to Full at the top left
  3. To download all the BioSample records, click on Send to: File at the top right
  4. Select Format: Full (text) (or choose another format) and click Create File
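
A scripted counterpart to the Send to: File steps might look like the sketch below. It assumes the E-utilities EFetch service with `rettype=full` and `retmode=text` for the BioSample database; the UIDs used are hypothetical placeholders, not records from this project, and real UIDs would have to come from a prior search or link lookup.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of NCBI E-utilities (an assumption; not mentioned in this guide).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def biosample_efetch_url(uids: list) -> str:
    """Build an EFetch URL requesting full-text BioSample records."""
    params = {
        "db": "biosample",
        "id": ",".join(uids),
        "rettype": "full",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def save_records(uids: list, path: str) -> None:
    """Download the records and write them to a local file (requires network)."""
    with urlopen(biosample_efetch_url(uids)) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


# Placeholder UIDs for illustration only.
example_uids = ["1", "2"]
print(biosample_efetch_url(example_uids))
```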

If a particular sample or location is of interest, use the geolocation information or the PathoMap Website linked from the BioProject record.

On the PathoMap Website, select both Subway Lines and Data Points under Reference in the right-hand menu, and the organism of interest in the left-hand menu. Clicking on the sample location on the map will provide information about the sample.
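
To work with the geolocation information outside the browser, the coordinates can be converted to signed decimal degrees. The sketch below assumes the coordinates are recorded as a string such as "40.75 N 73.99 W" (degrees followed by a hemisphere letter); that format and the example values are assumptions and should be checked against the actual BioSample records.

```python
import re

# Assumed format: "<degrees> N|S <degrees> E|W", e.g. "40.75 N 73.99 W".
LAT_LON = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*([NS])\s+(\d+(?:\.\d+)?)\s*([EW])\s*$"
)


def parse_lat_lon(text: str) -> tuple:
    """Convert a 'lat N/S lon E/W' string to signed decimal degrees."""
    match = LAT_LON.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {text!r}")
    lat, north_south, lon, east_west = match.groups()
    latitude = float(lat) * (1 if north_south == "N" else -1)
    longitude = float(lon) * (1 if east_west == "E" else -1)
    return latitude, longitude


# Illustrative coordinates, not taken from a specific sample.
print(parse_lat_lon("40.75 N 73.99 W"))
```

Southern and western hemispheres come out negative, which is the convention most mapping tools expect.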

NCBI tools are redundant and interlinked: the same information can be reached in multiple ways.

SRA can be searched independently. Alternatively, the SRA records associated with a specific BioProject or BioSample can be reached through the links on those records.

Each SRA record is given a unique accession number based on the source database (SRA, European Bioinformatics Institute (EBI), or DNA Data Bank of Japan (DDBJ)), and the type of record (Study, Sample, Experiment, Run):

  1. Study (e.g., the SRA record associated with a specific BioProject): SRP#, ERP#, or DRP#
  2. Sample (e.g., the SRA record associated with a specific BioSample): SRS#, ERS#, or DRS#
  3. Experiment (e.g., the SRA record for a specific experiment or run(s)): SRX#, ERX#, or DRX#
  4. Run (e.g., the SRA record for a specific run): SRR#, ERR#, or DRR#
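
The accession scheme above can be decoded mechanically: the first letter gives the source database and the third letter gives the record type. The following is a small sketch of that rule; the function name `classify_accession` is illustrative.

```python
import re

# First letter: source database. Third letter: record type.
SOURCE = {"S": "SRA", "E": "EBI", "D": "DDBJ"}
RECORD_TYPE = {"P": "Study", "S": "Sample", "X": "Experiment", "R": "Run"}
ACCESSION = re.compile(r"^([SED])R([PSXR])(\d+)$")


def classify_accession(accession: str) -> tuple:
    """Return (source database, record type) for an SRA-style accession."""
    match = ACCESSION.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not an SRA-style accession: {accession!r}")
    return SOURCE[match.group(1)], RECORD_TYPE[match.group(2)]


for accession in ("SRX836091", "SRR1748784"):
    print(accession, classify_accession(accession))
```

For the two accessions used in the examples on this page, SRX836091 is classified as an SRA Experiment and SRR1748784 as an SRA Run.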

Example: To find all the SRA records and sequence data from the 2014 metagenomic survey of the New York City subway system:

  1. From the BioProject "urban metagenome" record under Project Data, right-click on Sequence Data: 1572. This will display the records for all 1572 SRA experiments for this BioProject.

To see the details for each experiment, including the sequence data, click on its title.

Example: Shotgun sequencing of environmental sample: Sample P00189 (Accession SRX836091)

  1. Right-click on the title of the third record, Shotgun sequencing of environmental sample: Sample P00189
  2. Note the information about this experiment: the instrument used, links to the BioProject, BioSample, and SRA Study records, spot descriptor, and run information.
  3. In the Run table under Runs, right-click on the run accession number SRR1748784
  4. The Metadata tab of the Run browser shows information about the run.
  5. The Analysis tab shows a taxonomy of organisms identified from the sequence data.
  6. The Reads tab shows the sequence data.
  7. The Data Access tab provides links to access the sequence data.
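
Besides the links on the Data Access tab, sequence data for a run is often retrieved from the command line. The sketch below assumes the NCBI SRA Toolkit (its `prefetch` and `fasterq-dump` programs) is installed; the toolkit is not covered in this guide, and the output directory name is arbitrary. Running the block only prints the commands; `download_run` executes them.

```python
import shutil
import subprocess


def toolkit_commands(run_accession: str, outdir: str = "fastq") -> list:
    """Return the SRA Toolkit commands to download a run and convert it to FASTQ."""
    return [
        ["prefetch", run_accession],
        ["fasterq-dump", run_accession, "--outdir", outdir],
    ]


def download_run(run_accession: str, outdir: str = "fastq") -> None:
    """Execute the commands; fails early if the SRA Toolkit is not on PATH."""
    for command in toolkit_commands(run_accession, outdir):
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} not found; is the SRA Toolkit installed?")
        subprocess.run(command, check=True)


# Run accession from the example above.
for command in toolkit_commands("SRR1748784"):
    print(" ".join(command))
```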