EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data

Cited by: 267
Authors
Miller, Christopher S. [1]
Baker, Brett J. [1]
Thomas, Brian C. [1]
Singer, Steven W. [2,3]
Banfield, Jillian F. [1,4]
Affiliations
[1] Univ Calif Berkeley, Dept Earth & Planetary Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Lawrence Berkeley Lab, Div Earth Sci, Berkeley, CA 94720 USA
[3] Joint BioEnergy Inst, Deconstruct Div, Emeryville, CA 94660 USA
[4] Univ Calif Berkeley, Dept Environm Sci Policy & Management, Berkeley, CA 94720 USA
Source
GENOME BIOLOGY | 2011 / Vol. 12 / Issue 05
Keywords
MAXIMUM-LIKELIHOOD; RARE BIOSPHERE; RNA GENES; DEEP-SEA; DIVERSITY; AMPLIFICATION; ALIGNMENT; GENOME; ULTRAFAST; WRINKLES;
DOI
10.1186/gb-2011-12-5-r44
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Recovery of ribosomal small subunit genes by assembly of short read community DNA sequence data generally fails, making taxonomic characterization difficult. Here, we solve this problem with a novel iterative method, based on the expectation maximization algorithm, that reconstructs full-length small subunit gene sequences and provides estimates of relative taxon abundances. We apply the method to natural and simulated microbial communities, and correctly recover community structure from known and previously unreported rRNA gene sequences. An implementation of the method is freely available at https://github.com/csmiller/EMIRGE.
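The abundance-estimation idea in the abstract can be illustrated with a minimal sketch of expectation maximization on a toy problem. This is not the EMIRGE implementation: it assumes a precomputed table of read-to-sequence likelihoods and only re-estimates relative abundances, whereas the published method also iteratively revises the candidate gene sequences. The function name `estimate_abundances` and the input layout are hypothetical, chosen for illustration.

```python
def estimate_abundances(likelihoods, n_iter=50):
    """Toy EM for mixture proportions (illustrative assumption, not EMIRGE code).

    likelihoods[i][j] is an assumed, precomputed probability of observing
    read i given candidate sequence j. Returns one relative abundance per
    candidate sequence, summing to 1.
    """
    n_seqs = len(likelihoods[0])
    # Start from uniform abundances.
    priors = [1.0 / n_seqs] * n_seqs
    for _ in range(n_iter):
        totals = [0.0] * n_seqs
        for row in likelihoods:
            # E-step: split each read across sequences in proportion to
            # (current abundance) x (likelihood of the read).
            weighted = [p * l for p, l in zip(priors, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read matches no candidate; skip it
            for j, w in enumerate(weighted):
                totals[j] += w / norm
        assigned = sum(totals)
        if assigned == 0.0:
            return priors  # no read could be assigned; keep the last estimate
        # M-step: new abundance is the share of fractional read assignments.
        priors = [t / assigned for t in totals]
    return priors


# Two reads match only sequence 0, one matches only sequence 1,
# and one read is equally compatible with both.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
abundances = estimate_abundances(toy)
```

In this toy input the ambiguous read is shared according to the current abundance estimates, so the estimate for the first sequence settles near two thirds rather than the 0.625 that an even split of that read would give.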
Pages: 14