Benchmarking Metagenomics Tools for Taxonomic Classification

Cited by: 319
Authors
Ye, Simon H. [1, 2]
Siddle, Katherine J. [2, 3]
Park, Daniel J. [2]
Sabeti, Pardis C. [2, 3, 4, 5]
Affiliations
[1] MIT, Harvard Hlth Sci & Technol, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[3] Harvard Univ, Dept Organismal & Evolutionary Biol, Ctr Syst Biol, Cambridge, MA 02138 USA
[4] Harvard Sch Publ Hlth, Dept Immunol & Infect Dis, Boston, MA 02115 USA
[5] HHMI, Chevy Chase, MD 20815 USA
Keywords
ALIGNMENT; SURVEILLANCE; SEQUENCES; ABUNDANCE
DOI
10.1016/j.cell.2019.07.010
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Metagenomic sequencing is revolutionizing the detection and characterization of microbial species, and a wide variety of software tools are available to perform taxonomic classification of these data. The fast pace of development of these tools and the complexity of metagenomic data make it important that researchers are able to benchmark their performance. Here, we review current approaches for metagenomic analysis and evaluate the performance of 20 metagenomic classifiers using simulated and experimental datasets. We describe the key metrics used to assess performance, offer a framework for the comparison of additional classifiers, and discuss the future of metagenomic data analysis.
Pages: 779-794
Number of pages: 16