Upload Transcriptomics Data to Workspace


Transcriptomics Data in the Private Workspace

At PATRIC, you can upload your own pre-processed transcriptomics datasets, generated by microarray or RNA-seq technologies, to your workspace and analyze them using the annotations and analysis tools available at PATRIC.  Currently, PATRIC only supports differential gene expression data in the form of log ratios, generated by comparing two samples/conditions/time points.  You may also compare your data with other transcriptomics datasets available at PATRIC.  Data uploaded to your workspace is private and protected.


Using Uploaded Transcriptomics Data

  • Select one or more experiments and generate a dynamic gene list. Filter the gene list based on Log Ratio or Z-score cut-off, up/down regulation, or gene functions.  You may also save your gene selection as a group in your Workspace for future use.
  • Analyze the data using the Heatmap Viewer and clustering to quickly find genes that are similarly expressed across one or more comparisons.
  • Select a subset of genes and view corresponding metabolic pathways.
  • Compare your transcriptomics dataset with other published datasets available at PATRIC.
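
The log ratio and up/down regulation filters described above can be tried offline before uploading. The following is a minimal sketch, not PATRIC's implementation: the function name filter_genes, the cut-off of 0.8, and the "up"/"down"/"both" labels are assumptions made for illustration. The values are the Comparison 1 log ratios from the sample data further down this page.

```python
def filter_genes(log_ratios, cutoff=1.0, direction="both"):
    """Return gene IDs whose log ratio passes the cut-off.

    log_ratios: dict mapping gene ID -> log2(test/control)
    direction:  "up" (value >= cutoff), "down" (value <= -cutoff), or "both"
    """
    if direction not in ("up", "down", "both"):
        raise ValueError("direction must be 'up', 'down', or 'both'")
    selected = []
    for gene_id, value in log_ratios.items():
        is_up = value >= cutoff
        is_down = value <= -cutoff
        if (direction in ("up", "both") and is_up) or (
            direction in ("down", "both") and is_down
        ):
            selected.append(gene_id)
    return selected


# Comparison 1 values taken from the sample data on this page.
comparison1 = {
    "b0002": 0.767,
    "b0003": 0.815,
    "b0004": 0.856,
    "b0005": -1.014,
    "b0006": 0.296,
}

up_genes = filter_genes(comparison1, cutoff=0.8, direction="up")
down_genes = filter_genes(comparison1, cutoff=0.8, direction="down")
```

A symmetric cut-off is used here for simplicity; separate thresholds for up and down regulation would work the same way.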

To learn more about what types of analysis tools are available, see Transcriptomics FAQs.


Uploading Transcriptomics Data into Workspace

First, click on “MY WORKSPACE” in the blue banner at the top of every PATRIC page.  Then, click on “Analyze Your Transcriptomics Experiments” in the left panel to upload your dataset.  You can also upload your data from the Transcriptomics Tab on any Taxon or Genome Level Landing page by clicking on “Upload your transcriptomics data”.

Note:  You need to be a registered PATRIC user and logged in to your account to upload transcriptomics data.

When you click on a link to upload your transcriptomics data, a small pop-up window appears to guide you through the data upload process.

  1. Specify Files:  Upload your transcriptomics data file, containing differential gene expression values in the form of log ratios, by specifying its location on your computer or a web URL.  The file should be in one of the supported formats described below.  Optionally, you may also upload metadata related to sample comparisons in the prescribed format to help later in the data analysis.  Click to download Sample Data and Transcriptomics Templates.
  2. Map Gene Identifiers:  Once you have uploaded your data and metadata files, a brief summary of your data is presented to ensure that the data is parsed correctly.  You can specify the type of gene identifiers in your data file and map them to corresponding PATRIC identifiers.  Once mapping is done, the number of genes mapped is summarized.  Learn more about PATRIC identifiers in ID Mapping Tool FAQs and Annotation FAQs.

    Note:  Due to differences in annotations, some of the genes in your data may not map to corresponding PATRIC genes.  Unmapped genes are excluded from subsequent analysis.

  3. Describe Experiment:  Here, you may add more information about your dataset by providing experiment name (required), description, organism, and PubMed ID.
  4. Specify Group:  You may add your experiment to a new or existing group within your Workspace.  This allows you to group similar data sets together.  You can also place your dataset and other published datasets of interest available at PATRIC in the same group for comparison.
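
To anticipate how many genes might drop out at the Map Gene Identifiers step, you can run a similar check locally if you have your own identifier lookup table. This is a rough sketch only: the function name map_gene_ids is hypothetical, and the target identifiers are placeholder strings, not real PATRIC identifiers. The actual mapping is performed by PATRIC during upload.

```python
def map_gene_ids(gene_ids, id_map):
    """Split gene IDs into mapped and unmapped, given a lookup table.

    gene_ids: iterable of identifiers as they appear in the data file
    id_map:   dict mapping source identifier -> target identifier
    Returns (mapped, unmapped); unmapped genes would be left out of
    any downstream analysis.
    """
    mapped = {}
    unmapped = []
    for gene_id in gene_ids:
        if gene_id in id_map:
            mapped[gene_id] = id_map[gene_id]
        else:
            unmapped.append(gene_id)
    return mapped, unmapped


# Placeholder lookup table (hypothetical target IDs, for illustration only).
lookup = {"b0002": "TARGET_ID_A", "b0003": "TARGET_ID_B"}

mapped, unmapped = map_gene_ids(["b0002", "b0003", "b0004"], lookup)
summary = f"{len(mapped)} of {len(mapped) + len(unmapped)} genes mapped"
```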

Supported Transcriptomics File Formats

Currently, PATRIC allows you to upload your transcriptomics datasets in the form of differential gene expression measured as log ratios.  Data can be uploaded in multiple file formats:  comma separated values (.csv), tab delimited values (.txt), or Excel (.xls or .xlsx). Click to download Sample Data and Transcriptomics Templates.  Files should contain data in one of the following formats:

1. Gene Matrix:

Data is presented in a matrix format with genes as rows and samples as columns.  There should be one row for each gene and one column for each sample.  Each cell in the matrix provides a differential expression value of a gene in the given comparison measured as log ratio (i.e. log2 (test/control)).  Below is an example of transcriptomics data in Gene Matrix format.

Gene ID    Comparison 1    Comparison 2    Comparison 3
b0002       0.767          -1.316          -2.854
b0003       0.815          -1.841          -2.379
b0004       0.856          -1.643          -3.149
b0005      -1.014           1.511           0.393
b0006       0.296          -0.702          -1.985
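
As an illustration of how a Gene Matrix file might be prepared, the sketch below computes log2(test/control) values and writes them out as comma separated values. It assumes you already have normalized, positive expression values for each test and control sample. The gene names, the expression values, and the helper names log_ratio and write_gene_matrix are all hypothetical.

```python
import csv
import io
import math


def log_ratio(test, control):
    """Return log2(test/control); both values must be positive."""
    if test <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(test / control)


def write_gene_matrix(rows, comparisons, handle):
    """Write one header row, then one row per gene.

    rows:        dict mapping gene ID -> list of log ratios,
                 one per comparison, in the same order as `comparisons`
    comparisons: list of comparison IDs used as column names
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Gene ID"] + list(comparisons))
    for gene_id, values in rows.items():
        writer.writerow([gene_id] + [f"{value:.3f}" for value in values])


# Hypothetical normalized expression values (test, control).
rows = {
    "geneA": [log_ratio(200.0, 100.0), log_ratio(50.0, 100.0)],
    "geneB": [log_ratio(100.0, 100.0), log_ratio(400.0, 100.0)],
}

buffer = io.StringIO()
write_gene_matrix(rows, ["Comparison1", "Comparison2"], buffer)
matrix_csv = buffer.getvalue()
```

Writing to an in-memory buffer keeps the example self-contained; passing an open file handle instead would produce a .csv file ready for upload.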

2. Gene List:

Data is presented in three columns: Gene ID, Sample ID, and expression value. Expression value should be in the form of log ratio (i.e. log2 (test/control)).  Below is an example of transcriptomics data in Gene List format:

Gene ID    Comparison ID    Log Ratio
b0002      Comparison1       0.767
b0003      Comparison1       0.815
b0004      Comparison1       0.856
b0005      Comparison1      -1.014
b0006      Comparison1       0.296
b0002      Comparison2      -1.316
b0003      Comparison2      -1.841
b0004      Comparison2      -1.643
b0005      Comparison2       1.511
b0006      Comparison2      -0.702
b0002      Comparison3      -2.854
b0003      Comparison3      -2.379
b0004      Comparison3      -3.149
b0005      Comparison3       0.393
b0006      Comparison3      -1.985
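
The two formats carry the same information, so one can be derived from the other. The sketch below converts comma separated Gene Matrix text into Gene List records of (Gene ID, Comparison ID, Log Ratio). The function name matrix_to_gene_list is an assumption, and skipping empty cells is a design choice of this example rather than documented PATRIC behavior.

```python
import csv
import io


def matrix_to_gene_list(matrix_text):
    """Convert Gene Matrix CSV text into Gene List records.

    The first column is taken as the gene ID and every remaining
    column as one comparison. Records are grouped by comparison,
    matching the layout of the Gene List example above.
    """
    reader = csv.reader(io.StringIO(matrix_text))
    header = next(reader)
    comparisons = header[1:]
    rows = [row for row in reader if row]  # drop blank lines

    records = []
    for column, comparison_id in enumerate(comparisons, start=1):
        for row in rows:
            cell = row[column].strip() if column < len(row) else ""
            if cell == "":
                continue  # no value for this gene in this comparison
            records.append((row[0], comparison_id, float(cell)))
    return records


# First two genes and comparisons from the sample data on this page.
matrix_text = (
    "Gene ID,Comparison1,Comparison2\n"
    "b0002,0.767,-1.316\n"
    "b0003,0.815,-1.841\n"
)
gene_list = matrix_to_gene_list(matrix_text)
```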


Metadata File

PATRIC allows you to upload an additional file containing key metadata attributes about the comparisons.   This will enhance your analysis when using the Heatmap and clustering tools.  Below is a sample metadata table, which can be uploaded as a tab delimited or Excel file. Note:  The column names should be the same as in the example below.  Also, comparison IDs in the metadata file should match those in the data file.  Click to download Sample Data and Transcriptomics Templates.

Comparison ID    Title                               PubMed      Accession    Organism            Strain    Gene Modification    Experiment Condition    Time Point
Comparison1      rpoS exponential / WT exponential   17979199    GSE7885      Escherichia coli    MG1655    rpoS                 mutant vs wild type
Comparison2      rpoS biofilm / WT biofilm           17979199    GSE7885      Escherichia coli    MG1655    rpoS                 mutant vs wild type
Comparison3      WT biofilm / WT exponential         17979199    GSE7885      Escherichia coli    MG1655

The following is a brief description of the metadata fields currently supported:

  • Comparison ID:  Unique identifier for each comparison.
  • Title:  Brief title for the comparison showing two samples/conditions/time points being compared.
  • PubMed:  PubMed IDs for the publication describing the experiment and results, if available.  PubMed IDs will be linked to the corresponding page at PubMed.
  • Accession:  If the dataset you are uploading is already available in a public repository, such as GEO, you may provide the accession number for the reference.
  • Organism:  Organism used in the experiment.
  • Strain:  Specific bacterial strain used in the comparison.
  • Gene Modification:  Gene name or locus tag of the gene that is manipulated (mutation, knockout, or over/under expression) in the comparison.
  • Experiment Condition:  Primary experimental variable being compared.  Examples include: mutant vs wild type, strain comparison, time point, growth phase, temperature, etc.
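
Because the column names and comparison IDs in the metadata file must agree with the template and with the data file, a quick local check before uploading can save a round trip. The following is a sketch under assumptions: the function name check_metadata and the shape of its report are hypothetical, the list of expected columns is copied from the sample table above, and the input is assumed to be tab delimited text.

```python
import csv
import io

# Column names as they appear in the sample metadata table.
EXPECTED_COLUMNS = [
    "Comparison ID", "Title", "PubMed", "Accession", "Organism",
    "Strain", "Gene Modification", "Experiment Condition", "Time Point",
]


def check_metadata(metadata_text, data_comparison_ids):
    """Compare a tab-delimited metadata table against the data file's IDs.

    Returns a dict listing column names not found in the template and
    comparison IDs that appear in only one of the two files.
    """
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    header = reader.fieldnames or []
    unexpected = [name for name in header if name not in EXPECTED_COLUMNS]

    metadata_ids = set()
    for row in reader:
        comparison_id = (row.get("Comparison ID") or "").strip()
        if comparison_id:
            metadata_ids.add(comparison_id)

    data_ids = set(data_comparison_ids)
    return {
        "unexpected_columns": unexpected,
        "missing_from_metadata": sorted(data_ids - metadata_ids),
        "missing_from_data": sorted(metadata_ids - data_ids),
    }


metadata_text = (
    "Comparison ID\tTitle\tPubMed\n"
    "Comparison1\trpoS exponential / WT exponential\t17979199\n"
    "Comparison2\trpoS biofilm / WT biofilm\t17979199\n"
)
report = check_metadata(
    metadata_text, ["Comparison1", "Comparison2", "Comparison3"]
)
```

In this example the report flags Comparison3, which is present in the data file but has no metadata row.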

Is my data private and protected?

Yes, your data is private and protected in your workspace. Your data can be accessed only by logging into your account and is not available to anyone else.


Can I share my transcriptomics data with other PATRIC users?

As of now, you cannot share your data with other PATRIC users.  We plan to support data sharing in the near future.