Driving Biology Projects:


Driving Biology Projects (DBPs) are two-year awards that focus on infectious disease research related to human bacterial pathogens. Research includes the use of high-throughput (HTP) experimental technologies to functionally characterize the genome, proteome or metabolome of bacterial organisms and/or host-pathogen interactions. This research may help elucidate the role of genes, proteins and metabolites in pathogenesis, antimicrobial resistance or other biological processes of interest. The data and information generated by the DBPs will be made accessible to the broad scientific community and will be used to drive PATRIC’s infrastructure development, so that other researchers can perform similar analyses at the PATRIC website using their own data.


The DBP proposals were independently reviewed and evaluated by the PATRIC Scientific Working Group (SWG), and the top two proposals were selected for award in each funding period. The awarded DBPs are:

| Title | PI | Institution | Pathogen | Funding Period |
| --- | --- | --- | --- | --- |
| Comparative transcriptome and proteome analysis of Clostridium difficile strains | Yung-Fu Chang | Cornell University, Ithaca, NY | Clostridium difficile | 2010 – 2012 |
| Fitness Annotation of Bacterial Genomes | Michael McClelland | University of California, Irvine, CA | Salmonella enterica enterica serovar Typhimurium | 2010 – 2012 |
| Integrated genomic approaches to elucidating novel virulence factors for bacteria complicating influenza | Jon McCullers, Jason Rosch | St. Jude Children’s Research Hospital, Memphis, TN | Streptococcus pneumoniae, Staphylococcus aureus | 2012 – 2014 |
| Identification of Salmonella-specific Regulatory Networks | Joseph Wade | Wadsworth Center, Albany, NY | Salmonella enterica enterica serovar Typhimurium | 2012 – 2014 |


Comparative transcriptome and proteome analysis of Clostridium difficile strains

The overall goal of this project was to compare the transcriptome, proteome and phenome of several divergent Clostridium difficile strains using high-throughput technologies and bioinformatics tools, and to integrate the resultant data and tools into the PATRIC system. The transcriptomics and proteomics analysis results helped to verify and complement the genome sequence annotations of available sequenced strains, provided insights into the role of core, divergent and strain-specific genes in C. difficile pathogenesis, and expanded the PATRIC system with tools for an RNA-Seq analysis pipeline that can be used for other bacterial RNA-Seq projects. Using phenotype microarrays, we have elucidated the comprehensive nutritional and chemical sensitivity profile of historic and newer hypervirulent strains of Clostridium difficile. Our results show that some of the newer outbreak-associated strains have expanded metabolic potential, which might partly explain the spread of C. difficile infections in the community.

  • RNA-Seq data:
  • Proteomics data:
  • Phenotype array data:
  • Publications:
    • Scaria J, Mao C, Chen JW, McDonough SP, Sobral B, Chang YF. 2013. Differential stress transcriptome landscape of historic and recently emerged hypervirulent strains of Clostridium difficile strains determined using RNA-Seq. PLoS One. PMID: 24244315.
    • Chen JW, Scaria J, Mao C, Sobral B, Zhang S, Lawley T, Chang YF. 2013. Proteomic comparison of historic and recently emerged hypervirulent Clostridium difficile strains. J Proteome Res. PMID: 23298230.

Fitness Annotation of Bacterial Genomes

The goal of this project was to provide PATRIC with the information needed to display which genes of non-typhoidal Salmonella serovar Typhimurium (STm) contribute to survival in a variety of environments. This was accomplished through a combination of high-throughput screening and sequencing methods and unique resources developed to annotate the STm genome with fitness information. STm transcriptomes were generated from bacteria growing in defined environments, including rich and minimal media, at stationary phase, and in conditions that induce virulence pathways. These transcriptomes provide basal reference profiles to standardize and improve analysis of the impending onslaught of high-throughput transcriptomics data. Decoration of the genome with basal transcriptional data complements the fitness profile and other existing annotations in this archetypical strain.

  • Publications:
    • Canals, R., X. Q. Xia, C. Fronick, S. W. Clifton, B. M. Ahmer, H. L. Andrews-Polymenis, S. Porwollik and M. McClelland (2012). High-throughput comparison of gene fitness among related bacteria. BMC Genomics. PMID: 22646920.

Integrated genomic approaches to elucidating novel virulence factors for bacteria complicating influenza

This project will produce the following information / assets in collaboration with PATRIC:

  • 200 annotated genomes of Streptococcus pneumoniae and Staphylococcus aureus
  • A wealth of information linked to each genome including:
    • Patient-specific information including demographics (age, sex, race, ethnicity), underlying disease states (e.g., asthma, obesity), clinical presentation (e.g., pneumonia, sepsis), specific co-infection status (e.g., patient was co-infected with influenza), and a severity index
    • Strain-specific information including site of isolation (e.g., blood), relevant antibiotic resistance patterns, community-acquired vs. health care-associated status, serotype / genotype, and a detailed passage history
  • A database of disease-specific virulence determinants and associated mutations identified through a novel, cutting-edge, high-throughput method from 20 strains selected from this collection
  • A collection of characterized bacterial deletion / complementation mutants available for further research

Identification of Salmonella-specific Regulatory Networks

Dr. Wade’s group has identified 61 Salmonella-specific genes that encode transcription factors (TFs), of which 80% are completely unstudied. The goal of this project is to map the regulatory networks governed by Salmonella-specific TFs. We will identify TF binding sites using ChIP-seq, and we will determine the effects of each TF on gene expression using a modified RNA-Seq method that maps RNA 5′ ends. In conjunction with PATRIC staff, we will analyze the resulting data in the context of existing genomic datasets for S. enterica. In this way, we expect to identify many novel regulatory pathways, including those that contribute to virulence gene regulation in S. enterica. These data will provide an important new level of annotation for the S. enterica genome, and we anticipate that the same experimental and bioinformatic tools will then be used to uncover regulatory pathways in other bacterial species.
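As a rough illustration of how TF binding sites and expression effects might be combined, the sketch below pairs ChIP-seq peaks with mapped RNA 5′ ends and keeps genes that both sit near a peak for a given TF and change expression. It is not the project's pipeline: the record types, the `direct_targets` function, the 200 bp distance window, the fold-change cutoff, and all gene, TF and coordinate values are hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    tf: str          # transcription factor that was immunoprecipitated
    position: int    # genome coordinate of the ChIP-seq peak (hypothetical)


@dataclass(frozen=True)
class TranscriptStart:
    gene: str
    position: int             # genome coordinate of the mapped RNA 5' end
    log2_fold_change: float   # expression change when the TF is perturbed


def direct_targets(peaks, starts, tf, max_distance=200, min_abs_lfc=1.0):
    """Return genes whose 5' end lies near a peak for `tf` and whose
    expression changes by at least `min_abs_lfc` (both thresholds are
    illustrative assumptions, not values from the project)."""
    tf_positions = [p.position for p in peaks if p.tf == tf]
    hits = []
    for start in starts:
        if abs(start.log2_fold_change) < min_abs_lfc:
            continue  # expression did not change enough
        if any(abs(pos - start.position) <= max_distance for pos in tf_positions):
            hits.append(start.gene)
    return sorted(hits)


# Made-up example data for two hypothetical TFs.
peaks = [Peak("tfA", 1000), Peak("tfA", 5000), Peak("tfB", 9000)]
starts = [
    TranscriptStart("geneX", 1100, 2.3),
    TranscriptStart("geneY", 5050, 0.2),
    TranscriptStart("geneZ", 9050, -1.8),
]

print(direct_targets(peaks, starts, "tfA"))
```

In this toy data, geneY is bound but unchanged and geneZ changes but has no nearby tfA peak, so only geneX would be reported as a candidate direct target of tfA. A real analysis would also need strand information, statistical testing and replicate handling, all of which are left out here.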