Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli

Cited by: 106
Authors
Kim, Minseung [1, 2]
Rai, Navneet [2]
Zorraquino, Violeta [2]
Tagkopoulos, Ilias [1, 2]
Affiliations
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
Funding
National Science Foundation (USA);
Keywords
GENE-EXPRESSION; PROTEIN; MICROARRAY; GROWTH; DATABASE; MODEL; NORMALIZATION; INFORMATION; VALIDATION; METABOLISM;
DOI
10.1038/ncomms13090
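Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification code
07; 0710; 09;
Abstract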
A significant obstacle in training predictive cell models is the lack of integrated data sources. We develop semi-supervised normalization pipelines and perform experimental characterization (growth, transcriptional, proteome) to create Ecomics, a consistent, quality-controlled multi-omics compendium for Escherichia coli with cohesive meta-data information. We then use this resource to train a multi-scale model that integrates four omics layers to predict genome-wide concentrations and growth dynamics. The genetic and environmental ontology reconstructed from the omics data is substantially different from, and complementary to, the genetic and chemical ontologies. The integration of different layers confers an incremental increase in the prediction performance, as does the information about the known gene regulatory and protein-protein interactions. The predictive performance of the model ranges from 0.54 to 0.87 for the various omics layers, which far exceeds various baselines. This work provides an integrative framework of omics-driven predictive modelling that is broadly applicable to guide biological discovery.
Pages: 12