The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function

Cited by: 90
Authors
Ghatak, Sankha [1]
King, Zachary A. [1]
Sastry, Anand [1]
Palsson, Bernhard O. [1, 2, 3]
Affiliations
[1] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Pediat, La Jolla, CA 92093 USA
[3] Tech Univ Denmark, Novo Nord Fdn Ctr Biosustainabil, Bldg 220, DK-2800 Lyngby, Denmark
Funding
US National Science Foundation
Keywords
BETA-OXIDATION CYCLE; K-12; CELL; EVOLUTION; REVERSAL; NETWORK;
DOI
10.1093/nar/gkz030
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt and RegulonDB. We identified the genes that lack experimental evidence of function (the 'y-ome'), which includes 1600 of 4623 unique genes (34.6%), of which 111 have absolutely no evidence of function. An additional 220 genes (4.7%) are pseudogenes or phantom genes. y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with 'y' is unannotated, and we discuss the value of the y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.
Pages: 2446-2454
Number of pages: 9