A unified catalog of 204,938 reference genomes from the human gut microbiome

Cited by: 654
Authors
Almeida, Alexandre [1, 2]
Nayfach, Stephen [3, 4]
Boland, Miguel [1]
Strozzi, Francesco [5]
Beracochea, Martin [1]
Shi, Zhou Jason [6, 7]
Pollard, Katherine S. [6, 7, 8, 9, 10, 11]
Sakharova, Ekaterina [1]
Parks, Donovan H. [12]
Hugenholtz, Philip [12]
Segata, Nicola [13]
Kyrpides, Nikos C. [3, 4]
Finn, Robert D. [1]
Affiliations
[1] European Bioinformat Inst EMBL EBI, Wellcome Genome Campus, Hinxton, England
[2] Wellcome Sanger Inst, Wellcome Genome Campus, Hinxton, England
[3] US DOE, Joint Genome Inst, Walnut Creek, CA USA
[4] Lawrence Berkeley Natl Lab, Environm Genom & Syst Biol Div, Berkeley, CA USA
[5] Enterome Biosci, Paris, France
[6] Gladstone Inst, San Francisco, CA USA
[7] Chan Zuckerberg Biohub, San Francisco, CA USA
[8] Univ Calif San Francisco, Inst Human Genet, San Francisco, CA 94143 USA
[9] Univ Calif San Francisco, Inst Computat Hlth Sci, San Francisco, CA 94143 USA
[10] Univ Calif San Francisco, Quantitat Biol, San Francisco, CA 94143 USA
[11] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA USA
[12] Univ Queensland, Sch Chem & Mol Biosci, Australian Ctr Ecogen, Brisbane, Qld, Australia
[13] Univ Trento, CIBIO Dept, Trento, Italy
Funding
European Research Council; UK Biotechnology and Biological Sciences Research Council
Keywords
READ ALIGNMENT; ASSEMBLED GENOMES; ANALYSIS RESOURCE; ANNOTATION; BACTERIAL; GENES; VERSATILE; COVERAGE; DATABASE; CULTURE;
DOI
10.1038/s41587-020-0603-3
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Comprehensive, high-quality reference genomes are required for functional characterization and taxonomic assignment of the human gut microbiota. We present the Unified Human Gastrointestinal Genome (UHGG) collection, comprising 204,938 nonredundant genomes from 4,644 gut prokaryotes. These genomes encode >170 million protein sequences, which we collated in the Unified Human Gastrointestinal Protein (UHGP) catalog. The UHGP more than doubles the number of gut proteins in comparison to those present in the Integrated Gene Catalog. More than 70% of the UHGG species lack cultured representatives, and 40% of the UHGP lack functional annotations. Intraspecies genomic variation analyses revealed a large reservoir of accessory genes and single-nucleotide variants, many of which are specific to individual human populations. The UHGG and UHGP collections will enable studies linking genotypes to phenotypes in the human gut microbiome.
Pages: 105-114
Page count: 10