Taxonomy based performance metrics for evaluating taxonomic assignment methods

Cited by: 6
Authors
Chen, Chung-Yen [1]
Tang, Sen-Lin [2]
Chou, Seng-Cho T. [1]
Affiliations
[1] Natl Taiwan Univ, Dept Informat Management, Taipei 106, Taiwan
[2] Acad Sinica, Biodivers Res Ctr, Taipei 115, Taiwan
Keywords
Metagenomics; Classification; Performance evaluation; Data analysis; RNA gene database
DOI
10.1186/s12859-019-2896-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Metagenomics experiments often make inferences about microbial communities by sequencing 16S and 18S rRNA, and taxonomic assignment is a fundamental step in such studies. This paper addresses the weaknesses in two types of metrics commonly used by previous studies for measuring the performance of existing taxonomic assignment methods: sequence count based metrics and binary error measurement. These metrics made performance evaluation results biased, less informative and mutually incomparable.

Results: We investigated weaknesses in two types of metrics and proposed new performance metrics including Average Taxonomy Distance (ATD) and ATD_by_Taxa, together with the visualized ATD plot.

Conclusions: By comparing the evaluation results from four popular taxonomic assignment methods across three test data sets, we found the new metrics more robust, informative and comparable.
Pages: 11