Mixture models for analysis of the taxonomic composition of metagenomes

Cited by: 41
Authors
Meinicke, Peter [1]
Asshauer, Kathrin Petra [1]
Lingner, Thomas [1]
Affiliations
[1] Univ Gottingen, Inst Microbiol & Genet, Dept Bioinformat, Gottingen, Germany
Keywords
PHYLOGENETIC CLASSIFICATION; ENCYCLOPEDIA; DIVERSITY; RESOURCE; GENES; SETS;
DOI
10.1093/bioinformatics/btr266
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Inferring the taxonomic profile of a microbial community from a large collection of anonymous DNA sequencing reads is a challenging task in metagenomics. Because existing methods for taxonomic profiling of metagenomes are all based on the assignment of fragmentary sequences to phylogenetic categories, the accuracy of results largely depends on fragment length. This dependence complicates comparative analysis of data originating from different sequencing platforms or resulting from different preprocessing pipelines.

Results: We here introduce a new method for taxonomic profiling based on mixture modeling of the overall oligonucleotide distribution of a sample. Our results indicate that the mixture-based profiles compare well with taxonomic profiles obtained with other methods. However, in contrast to the existing methods, our approach shows a nearly constant profiling accuracy across all kinds of read lengths and it operates at an unrivaled speed.
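
The idea of profiling a sample through a mixture over its overall oligonucleotide distribution can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how the mixture weights are estimated, so the sketch assumes a generic EM fit of multinomial mixture weights, with the reference k-mer distributions held fixed. The function names (kmer_counts, kmer_distribution, estimate_weights), the pseudocount, and the iteration count are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in one sequence, skipping any k-mer
    that contains a character outside A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_distribution(seqs, k, pseudocount=1.0):
    """Relative k-mer frequencies pooled over sequences.
    The pseudocount (an assumption of this sketch) keeps every
    probability strictly positive."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = Counter()
    for s in seqs:
        total.update(kmer_counts(s, k))
    norm = sum(total[m] + pseudocount for m in kmers)
    return {m: (total[m] + pseudocount) / norm for m in kmers}


def estimate_weights(sample_counts, references, iterations=200):
    """Fit mixture weights so that the weighted sum of the fixed reference
    distributions approximates the sample's k-mer counts.
    Uses plain EM for multinomial mixture weights (assumed here, not
    taken from the paper)."""
    names = list(references)
    weights = {n: 1.0 / len(names) for n in names}
    total = sum(sample_counts.values())
    for _ in range(iterations):
        expected = {n: 0.0 for n in names}
        for kmer, c in sample_counts.items():
            mix = sum(weights[n] * references[n][kmer] for n in names)
            for n in names:
                # E-step: share of this k-mer's count attributed to n
                expected[n] += c * weights[n] * references[n][kmer] / mix
        # M-step: renormalise the attributed counts into weights
        weights = {n: expected[n] / total for n in names}
    return weights
```

Because the fit works on k-mer counts pooled over the whole sample, no individual read is ever assigned to a taxon, which is consistent with the abstract's claim that accuracy is nearly independent of read length. In use, one would build one reference distribution per taxon from known genomes, pool kmer_counts over all reads of the sample, and read the returned weights as the estimated taxonomic profile.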
Pages: 1618-1624
Number of pages: 7