Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

Cited by: 804
Authors
Nielsen, H. Bjorn [1, 2]
Almeida, Mathieu [3, 4, 5]
Juncker, Agnieszka Sierakowska [1, 2]
Rasmussen, Simon [1]
Li, Junhua [6, 7, 8]
Sunagawa, Shinichi [9]
Plichta, Damian R. [1]
Gautier, Laurent [1]
Pedersen, Anders G. [1]
Le Chatelier, Emmanuelle [3, 4]
Pelletier, Eric [10, 11, 12]
Bonde, Ida [1, 2]
Nielsen, Trine [13]
Manichanh, Chaysavanh [14]
Arumugam, Manimozhiyan [7, 13]
Batto, Jean-Michel [3, 4]
dos Santos, Marcelo B. Quintanilha [1]
Blom, Nikolaj [2]
Borruel, Natalia [14]
Burgdorf, Kristoffer S. [13]
Boumezbeur, Fouad [4]
Casellas, Francesc [14]
Dore, Joel [3, 4]
Dworzynski, Piotr [1]
Guarner, Francisco [14]
Hansen, Torben [13, 15]
Hildebrand, Falk [16, 17]
Kaas, Rolf S. [18]
Kennedy, Sean [3, 4]
Kristiansen, Karsten [19]
Kultima, Jens Roat [9]
Leonard, Pierre [4]
Levenez, Florence [3, 4]
Lund, Ole [1]
Moumen, Bouziane [3, 4]
Le Paslier, Denis [10, 11, 12]
Pons, Nicolas [3, 4]
Pedersen, Oluf [13, 20, 21, 22]
Prifti, Edi [3, 4]
Qin, Junjie [6, 7]
Raes, Jeroen [17, 23, 24]
Sorensen, Soren [25]
Tap, Julien [9]
Tims, Sebastian [26]
Ussery, David W. [1]
Yamada, Takuji [9, 27]
Renault, Pierre [3]
Sicheritz-Ponten, Thomas [1, 2]
Bork, Peer [9, 28]
Wang, Jun [7, 13, 19, 29]
Affiliations
[1] Tech Univ Denmark, Ctr Biol Sequence Anal, Kongens Lyngby, Denmark
[2] Tech Univ Denmark, Novo Nordisk Fdn Ctr Biosustainabil, Kongens Lyngby, Denmark
[3] INRA, UMR 14121 MICALIS, Jouy En Josas, France
[4] INRA, US 1367 Metagenopolis, Jouy En Josas, France
[5] Univ Maryland, Ctr Bioinformat & Computat Biol, Dept Comp Sci, College Pk, MD USA
[6] BGI Hong Kong Res Inst, Hong Kong, Hong Kong, Peoples R China
[7] BGI Shenzhen, Shenzhen, Peoples R China
[8] S China Univ Technol, Sch Biosci & Biotechnol, Guangzhou, Guangdong, Peoples R China
[9] European Mol Biol Lab, D-69012 Heidelberg, Germany
[10] Inst Genom, Commissariat Energie Atom & Energies Alternat, Evry, France
[11] Ctr Natl Rech Scientif, Evry, France
[12] Univ Evry Val dEssonne, Evry, France
[13] Univ Copenhagen, Novo Nordisk Fdn Ctr Basic Metabol Res, Copenhagen, Denmark
[14] Univ Hosp Vall dHebron, Digest Syst Res Unit, Ciberehd, Barcelona, Spain
[15] Univ So Denmark, Fac Hlth Sci, Odense, Denmark
[16] VIB, Dept Biol Struct, Brussels, Belgium
[17] Vrije Univ Brussel, Dept Biosci Engn, Brussels, Belgium
[18] Tech Univ Denmark, Natl Food Inst, Div Epidemiol & Microbial Gen, Kongens Lyngby, Denmark
[19] Univ Copenhagen, Dept Biol, Copenhagen, Denmark
[20] Hagedorn Res Inst, Gentofte, Denmark
[21] Univ Copenhagen, Fac Hlth & Med Sci, Inst Biomed Sci, Copenhagen, Denmark
[22] Aarhus Univ, Fac Hlth, Aarhus, Denmark
[23] Katholieke Univ Leuven, Rega Inst, Dept Microbiol & Immunol, Louvain, Belgium
[24] VIB, Ctr Biol Dis, Leuven, Belgium
[25] Univ Copenhagen, Dept Biol, Microbiol Sect, Copenhagen, Denmark
[26] Wageningen Univ, Microbiol Lab, NL-6700 AP Wageningen, Netherlands
[27] Tokyo Inst Technol, Dept Biol Informat, Yokohama, Kanagawa 227, Japan
[28] Max Delbruck Ctr Mol Med, Berlin, Germany
[29] King Abdulaziz Univ, Princess Al Jawhara Ctr Excellence Res Hereditary, Jeddah 21413, Saudi Arabia
[30] Kings Coll London, Guys Hosp, Ctr Host Microbiome Interact, Dent Inst Cent Off, London WC2R 2LS, England
Keywords
MICROBIOTA; SEQUENCE; TREE; SETS; TOOL;
DOI
10.1038/nbt.2939
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
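The idea of binning co-abundant genes across samples can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a greedy, seed-based grouping using Pearson correlation, and the function names, the `min_corr` threshold of 0.9, and the toy abundance profiles are all hypothetical choices made only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(vx * vy)

def bin_coabundant_genes(profiles, min_corr=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles: dict mapping gene id -> list of abundances, one per sample.
    min_corr: hypothetical correlation threshold (an assumption).
    Returns a list of groups (lists of gene ids), in seed order.
    """
    unassigned = list(profiles)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)  # first unassigned gene seeds a new group
        group = [seed]
        remaining = []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= min_corr:
                group.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        groups.append(group)
    return groups

# Toy data: four genes measured in four samples (made-up values).
profiles = {
    "geneA": [10, 0, 5, 20],
    "geneB": [21, 1, 9, 40],
    "geneC": [0, 8, 7, 1],
    "geneD": [1, 15, 14, 2],
}
groups = bin_coabundant_genes(profiles)
print(groups)
```

In this toy input, geneA and geneB rise and fall together across samples, as do geneC and geneD, so each pair ends up in its own group. Genes from the same organism are expected to show this kind of shared abundance pattern, which is what makes grouping possible without a reference genome.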
Pages: 822-828
Page count: 7
Related papers
50 items in total
[1]   Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes [J].
Albertsen, Mads ;
Hugenholtz, Philip ;
Skarshewski, Adam ;
Nielsen, Kare L. ;
Tyson, Gene W. ;
Nielsen, Per H. .
NATURE BIOTECHNOLOGY, 2013, 31 (06) :533-+
[2]  
[Anonymous], 2003, 3 INT WORKSHOP DISTR
[3]  
[Anonymous], GENOME RES
[4]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[5]   CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats [J].
Bland, Charles ;
Ramsey, Teresa L. ;
Sabree, Fareedah ;
Lowe, Micheal ;
Brown, Kyndall ;
Kyrpides, Nikos C. ;
Hugenholtz, Philip .
BMC BIOINFORMATICS, 2007, 8 (1)
[6]   Genome Project Standards in a New Era of Sequencing [J].
Chain, P. S. G. ;
Grafham, D. V. ;
Fulton, R. S. ;
FitzGerald, M. G. ;
Hostetler, J. ;
Muzny, D. ;
Ali, J. ;
Birren, B. ;
Bruce, D. C. ;
Buhay, C. ;
Cole, J. R. ;
Ding, Y. ;
Dugan, S. ;
Field, D. ;
Garrity, G. M. ;
Gibbs, R. ;
Graves, T. ;
Han, C. S. ;
Harrison, S. H. ;
Highlander, S. ;
Hugenholtz, P. ;
Khouri, H. M. ;
Kodira, C. D. ;
Kolker, E. ;
Kyrpides, N. C. ;
Lang, D. ;
Lapidus, A. ;
Malfatti, S. A. ;
Markowitz, V. ;
Metha, T. ;
Nelson, K. E. ;
Parkhill, J. ;
Pitluck, S. ;
Qin, X. ;
Read, T. D. ;
Schmutz, J. ;
Sozhamannan, S. ;
Sterk, P. ;
Strausberg, R. L. ;
Sutton, G. ;
Thomson, N. R. ;
Tiedje, J. M. ;
Weinstock, G. ;
Wollam, A. ;
Detter, J. C. .
SCIENCE, 2009, 326 (5950) :236-237
[7]   Genome Sequence of the Probiotic Strain Bifidobacterium animalis subsp lactis CNCM I-2494 [J].
Chervaux, Christian ;
Grimaldi, Christine ;
Bolotin, Alexander ;
Quinquis, Benoit ;
Legrain-Raspaud, Sophie ;
Vlieg, Johan E. T. van Hylckama ;
Denariaz, Gerard ;
Smokvina, Tamara .
JOURNAL OF BACTERIOLOGY, 2011, 193 (19) :5560-+
[8]   Toward automatic reconstruction of a highly resolved tree of life [J].
Ciccarelli, FD ;
Doerks, T ;
von Mering, C ;
Creevey, CJ ;
Snel, B ;
Bork, P .
SCIENCE, 2006, 311 (5765) :1283-1287
[9]   Assemblathon 1: A competitive assessment of de novo short read assembly methods [J].
Earl, Dent ;
Bradnam, Keith ;
St John, John ;
Darling, Aaron ;
Lin, Dawei ;
Fass, Joseph ;
Hung On Ken Yu ;
Buffalo, Vince ;
Zerbino, Daniel R. ;
Diekhans, Mark ;
Ngan Nguyen ;
Ariyaratne, Pramila Nuwantha ;
Sung, Wing-Kin ;
Ning, Zemin ;
Haimel, Matthias ;
Simpson, Jared T. ;
Fonseca, Nuno A. ;
Birol, Inanc ;
Docking, T. Roderick ;
Ho, Isaac Y. ;
Rokhsar, Daniel S. ;
Chikhi, Rayan ;
Lavenier, Dominique ;
Chapuis, Guillaume ;
Naquin, Delphine ;
Maillet, Nicolas ;
Schatz, Michael C. ;
Kelley, David R. ;
Phillippy, Adam M. ;
Koren, Sergey ;
Yang, Shiaw-Pyng ;
Wu, Wei ;
Chou, Wen-Chi ;
Srivastava, Anuj ;
Shaw, Timothy I. ;
Ruby, J. Graham ;
Skewes-Cox, Peter ;
Betegon, Miguel ;
Dimon, Michelle T. ;
Solovyev, Victor ;
Seledtsov, Igor ;
Kosarev, Petr ;
Vorobyev, Denis ;
Ramirez-Gonzalez, Ricardo ;
Leggett, Richard ;
MacLean, Dan ;
Xia, Fangfang ;
Luo, Ruibang ;
Li, Zhenyu ;
Xie, Yinlong .
GENOME RESEARCH, 2011, 21 (12) :2224-2241
[10]   HMMER web server: interactive sequence similarity searching [J].
Finn, Robert D. ;
Clements, Jody ;
Eddy, Sean R. .
NUCLEIC ACIDS RESEARCH, 2011, 39 :W29-W37