Genome Metadata


Genome Metadata Overview

Genome metadata is descriptive data about individual genomes. Genome metadata on PATRIC consists of 61 different metadata fields, called attributes, which are organized into the following seven broad categories: Organism Info, Isolate Info, Host Info, Sequence Info, Phenotype Info, Project Info, and Others.


Metadata Attributes

  • Organism Info Attributes: Genome Info Id, Genome Name, NCBI Taxon Id, Genome Status, Organism Name, Strain, Serovar, Biovar, Pathovar, Culture Collection, and Type Strain.
  • Isolate Info Attributes: Isolation Site, Isolation Source, Isolation Comments, Collection Date, Isolation Country, Geographic Location, Latitude, Longitude, Altitude, and Depth.
  • Host Info Attributes: Host Name, Host Gender, Host Age, Host Health, Body Sample Site, and Body Sample Subsite.
  • Sequence Info Attributes: Sequencing Status, Sequencing Platform, Sequencing Depth, Assembly Method, Chromosomes, Plasmids, Contigs, Sequences, Genome Length, GC Content, RAST CDS, BRC CDS, and RefSeq CDS.
  • Phenotype Info Attributes: Gram Stain, Cell Shape, Motility, Sporulation, Temperature Range, Optimal Temperature, Salinity, Oxygen Requirement, Habitat, and Disease.
  • Project Info Attributes: Project Status, Availability, Sequencing Center, Completion Date, Publication, NCBI Project Id, RefSeq Project Id, Genbank Accessions, and RefSeq Accessions.
  • Others: Comments.
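
For readers who handle these attributes in scripts, the categories above can be sketched as a simple lookup table. This is a minimal illustration only: the names GENOME_METADATA_CATEGORIES and category_of are hypothetical and are not part of any PATRIC tool or API. The attribute names are copied from the list above.

```python
from typing import Optional

# Hypothetical grouping of the attribute names listed above.
# This dictionary is illustrative and is not a PATRIC data structure.
GENOME_METADATA_CATEGORIES = {
    "Organism Info": [
        "Genome Info Id", "Genome Name", "NCBI Taxon Id", "Genome Status",
        "Organism Name", "Strain", "Serovar", "Biovar", "Pathovar",
        "Culture Collection", "Type Strain",
    ],
    "Isolate Info": [
        "Isolation Site", "Isolation Source", "Isolation Comments",
        "Collection Date", "Isolation Country", "Geographic Location",
        "Latitude", "Longitude", "Altitude", "Depth",
    ],
    "Host Info": [
        "Host Name", "Host Gender", "Host Age", "Host Health",
        "Body Sample Site", "Body Sample Subsite",
    ],
    "Sequence Info": [
        "Sequencing Status", "Sequencing Platform", "Sequencing Depth",
        "Assembly Method", "Chromosomes", "Plasmids", "Contigs", "Sequences",
        "Genome Length", "GC Content", "RAST CDS", "BRC CDS", "RefSeq CDS",
    ],
    "Phenotype Info": [
        "Gram Stain", "Cell Shape", "Motility", "Sporulation",
        "Temperature Range", "Optimal Temperature", "Salinity",
        "Oxygen Requirement", "Habitat", "Disease",
    ],
    "Project Info": [
        "Project Status", "Availability", "Sequencing Center",
        "Completion Date", "Publication", "NCBI Project Id",
        "RefSeq Project Id", "Genbank Accessions", "RefSeq Accessions",
    ],
    "Others": ["Comments"],
}


def category_of(attribute: str) -> Optional[str]:
    """Return the category that lists the attribute, or None if it is not listed."""
    wanted = attribute.strip().lower()  # match case-insensitively, ignoring outer spaces
    for category, attributes in GENOME_METADATA_CATEGORIES.items():
        if any(wanted == name.lower() for name in attributes):
            return category
    return None


print(category_of("Host Age"))  # Host Info
```

A lookup like this can help when grouping the columns of a downloaded metadata table by category.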

Accessing Genome Metadata on the PATRIC Website

Genome metadata can be accessed in multiple ways on the PATRIC website:

  • You may view a metadata table, at any taxonomic level, via the Genome List tab on any Organism Landing page.  To learn more about Organism Landing pages and the Genome List tab, see the PATRIC Data Organization Overview tutorial.
  • Selecting “Genome Metadata” from the “Searches and Tools” tab in the main navigation along the top of the PATRIC site will take you to the Genome List tab for all bacterial data available on PATRIC.  To learn how to sort, progressively filter, and download data within these tables, see Genome Finder FAQs.
  • The Overview tab on each Organism Landing page (at any taxonomic level) provides a key metadata attribute summary in the “Taxonomy Summary” table.  Clicking on “View all genomes and summary terms” at the bottom of this table will take you to the table provided under the Genome List tab.
    • Note: The Taxonomy Summary table does not show every attribute or all available attribute data; it shows only the top 2 hits associated with each visible attribute.
  • Once you drill down to the Genome level within PATRIC, the Overview tab on each Genome Landing page provides a key metadata attribute summary in the “Genome Summary” table.  Clicking “more” at the bottom of this table expands the table to show all available metadata for that particular genome.
  • You may select parameters to conduct a site-wide search using PATRIC’s Genome Finder.  To learn how to search for and sort genome metadata, see Genome Finder FAQs.
  • To learn more about Organism and Genome Landing pages, see the PATRIC Data Organization Overview tutorial.

Genome Metadata Sources

PATRIC has collected metadata associated with bacterial genomes from multiple sources, such as NCBI’s BioProject database, GenBank records, and the Human Microbiome Project. Following the automated collection, the metadata was manually curated for consistency and accuracy, organized into a relational database, and integrated with the genomes available in PATRIC.

PATRIC’s collection and refinement of genome metadata is an ongoing task. Future plans include integrating the genome metadata with various analysis tools available on the PATRIC website. If you have access to metadata that is not currently available on PATRIC, please contact us and we’ll be happy to update the site.