Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences

Cited by: 242
Author
Edgar, Robert C.
Affiliation
[1] Sonoma, CA
Keywords
Microbiome; Taxonomy; Algorithm; Benchmark; INTERNAL TRANSCRIBED SPACER; HUMAN MICROBIOME; BAYESIAN CLASSIFIER; IDENTIFICATION; DATABASE; GREENGENES; DIVERSITY; BACTERIA; SUBUNIT; SEARCH;
DOI
10.7717/peerj.4652
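Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code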
07 ; 0710 ; 09 ;
Abstract
Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ~100% at 100% identity but ~50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pairwise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal.
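The "lowest shared rank" probability described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the rank list, the integer-percent binning, and the toy lineages below are all assumptions made for illustration, and the toy pairs are invented so that the 95% bin comes out evenly split. Pairwise identities are assumed to have been computed elsewhere.

```python
from collections import Counter, defaultdict

# Ranks ordered from broadest to most specific (an assumed, simplified list).
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def lineage(*names):
    """Build a rank -> name mapping from names given in RANKS order."""
    return dict(zip(RANKS, names))


def lowest_shared_rank(tax_a, tax_b):
    """Return the most specific rank at which both lineages agree, or None."""
    shared = None
    for rank in RANKS:
        a, b = tax_a.get(rank), tax_b.get(rank)
        if a is None or b is None or a != b:
            break
        shared = rank
    return shared


def lowest_shared_rank_probabilities(pairs):
    """Estimate P(lowest shared rank | identity bin) from annotated pairs.

    pairs: iterable of (identity_percent, tax_a, tax_b).
    Identities are binned by their integer part (e.g. 95.7 -> 95).
    """
    counts = defaultdict(Counter)
    for identity, tax_a, tax_b in pairs:
        counts[int(identity)][lowest_shared_rank(tax_a, tax_b)] += 1
    return {
        bin_key: {rank: n / sum(c.values()) for rank, n in c.items()}
        for bin_key, c in counts.items()
    }


# Hypothetical toy data: placeholder taxon names, not real organisms.
ref = lineage("D", "P", "C", "O", "F", "G")
toy_pairs = [
    (95.2, ref, lineage("D", "P", "C", "O", "F", "G")),     # same genus
    (95.7, ref, lineage("D", "P", "C", "O", "F", "G2")),    # same family
    (95.1, ref, lineage("D", "P", "C", "O", "F2", "G3")),   # same order
    (95.9, ref, lineage("D", "P", "C", "O2", "F3", "G4")),  # same class
    (100.0, ref, ref),                                      # identical lineage
]

probabilities = lowest_shared_rank_probabilities(toy_pairs)
for bin_key in sorted(probabilities):
    print(bin_key, probabilities[bin_key])
```

In this invented data set the 95% bin is split evenly across genus, family, order and class. That is the kind of distribution the abstract calls a twilight zone: an identity value alone does not say which rank a query shares with its closest reference.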
Pages: 29