The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function

Cited by: 90
Authors
Ghatak, Sankha [1 ]
King, Zachary A. [1 ]
Sastry, Anand [1 ]
Palsson, Bernhard O. [1 ,2 ,3 ]
Affiliations
[1] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Pediat, La Jolla, CA 92093 USA
[3] Tech Univ Denmark, Novo Nord Fdn Ctr Biosustainabil, Bldg 220, DK-2800 Lyngby, Denmark
Funding
National Science Foundation (USA);
Keywords
BETA-OXIDATION CYCLE; K-12; CELL; EVOLUTION; REVERSAL; NETWORK;
DOI
10.1093/nar/gkz030
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt and RegulonDB. We identified the genes that lack experimental evidence of function (the 'y-ome'), which include 1600 of 4623 unique genes (34.6%), of which 111 have absolutely no evidence of function. An additional 220 genes (4.7%) are pseudogenes or phantom genes. y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with 'y' is unannotated, and we discuss the value of the y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.
Pages: 2446 - 2454
Page count: 9
Related papers
54 in total
  • [1] Long-range periodic patterns in microbial genomes indicate significant multi-scale chromosomal organization
    Allen, Timothy E.
    Price, Nathan D.
    Joyce, Andrew R.
    Palsson, Bernhard O.
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2006, 2 (01) : 13 - 21
  • [2] [Anonymous], NAT METHODS
  • [3] [Anonymous], P NATL ACAD SCI US
  • [4] [Anonymous], NAT BIOTECHNOL
  • [5] The COMBREX Project: Design, Methodology, and Initial Results
    Anton, Brian P.
    Chang, Yi-Chien
    Brown, Peter
    Choi, Han-Pil
    Faller, Lina L.
    Guleria, Jyotsna
    Hu, Zhenjun
    Klitgord, Niels
    Levy-Moonshine, Ami
    Maksad, Almaz
    Mazumdar, Varun
    McGettrick, Mark
    Osmani, Lais
    Pokrzywa, Revonda
    Rachlin, John
    Swaminathan, Rajeswari
    Allen, Benjamin
    Housman, Genevieve
    Monahan, Caitlin
    Rochussen, Krista
    Tao, Kevin
    Bhagwat, Ashok S.
    Brenner, Steven E.
    Columbus, Linda
    de Crecy-Lagard, Valerie
    Ferguson, Donald
    Fomenkov, Alexey
    Gadda, Giovanni
    Morgan, Richard D.
    Osterman, Andrei L.
    Rodionov, Dmitry A.
    Rodionova, Irina A.
    Rudd, Kenneth E.
    Soll, Dieter
    Spain, James
    Xu, Shuang-Yong
    Bateman, Alex
    Blumenthal, Robert M.
    Bollinger, J. Martin
    Chang, Woo-Suk
    Ferrer, Manuel
    Friedberg, Iddo
    Galperin, Michael Y.
    Gobeill, Julien
    Haft, Daniel
    Hunt, John
    Karp, Peter
    Klimke, William
    Krebs, Carsten
    Macelis, Dana
    [J]. PLOS BIOLOGY, 2013, 11 (08):
  • [6] The fractured landscape of RNA-seq alignment: the default in our STARs
    Ballouz, Sara
    Dobin, Alexander
    Gingeras, Thomas R.
    Gillis, Jesse
    [J]. NUCLEIC ACIDS RESEARCH, 2018, 46 (10) : 5125 - 5138
  • [7] NCBI GEO: archive for functional genomics data sets-update
    Barrett, Tanya
    Wilhite, Stephen E.
    Ledoux, Pierre
    Evangelista, Carlos
    Kim, Irene F.
    Tomashevsky, Maxim
    Marshall, Kimberly A.
    Phillippy, Katherine H.
    Sherman, Patti M.
    Holko, Michelle
    Yefanov, Andrey
    Lee, Hyeseung
    Zhang, Naigong
    Robertson, Cynthia L.
    Serova, Nadezhda
    Davis, Sean
    Soboleva, Alexandra
    [J]. NUCLEIC ACIDS RESEARCH, 2013, 41 (D1) : D991 - D995
  • [8] UniProt: a hub for protein information
    Bateman, Alex
    Martin, Maria Jesus
    O'Donovan, Claire
    Magrane, Michele
    Apweiler, Rolf
    Alpi, Emanuele
    Antunes, Ricardo
    Arganiska, Joanna
    Bely, Benoit
    Bingley, Mark
    Bonilla, Carlos
    Britto, Ramona
    Bursteinas, Borisas
    Chavali, Gayatri
    Cibrian-Uhalte, Elena
    Da Silva, Alan
    De Giorgi, Maurizio
    Dogan, Tunca
    Fazzini, Francesco
    Gane, Paul
    Castro, Leyla Garcia
    Garmiri, Penelope
    Hatton-Ellis, Emma
    Hieta, Reija
    Huntley, Rachael
    Legge, Duncan
    Liu, Wudong
    Luo, Jie
    MacDougall, Alistair
    Mutowo, Prudence
    Nightingale, Andrew
    Orchard, Sandra
    Pichler, Klemens
    Poggioli, Diego
    Pundir, Sangya
    Pureza, Luis
    Qi, Guoying
    Rosanoff, Steven
    Saidi, Rabie
    Sawford, Tony
    Shypitsyna, Aleksandra
    Turner, Edward
    Volynkin, Vladimir
    Wardell, Tony
    Watkins, Xavier
    Zellner, Hermann
    Cowley, Andrew
    Figueira, Luis
    Li, Weizhong
    McWilliam, Hamish
    [J]. NUCLEIC ACIDS RESEARCH, 2015, 43 (D1) : D204 - D212
  • [9] Constraint-based models predict metabolic and associated cellular functions
    Bordbar, Aarash
    Monk, Jonathan M.
    King, Zachary A.
    Palsson, Bernhard O.
    [J]. NATURE REVIEWS GENETICS, 2014, 15 (02) : 107 - 120
  • [10] Chromosome position effects on gene expression in Escherichia coli K-12
    Bryant, Jack A.
    Sellars, Laura E.
    Busby, Stephen J. W.
    Lee, David J.
    [J]. NUCLEIC ACIDS RESEARCH, 2014, 42 (18) : 11383 - 11392