BASTA - Taxonomic classification of sequences and sequence bins using last common ancestor estimations

Times cited: 67
Authors
Kahlke, Tim [1]
Ralph, Peter J. [1]
Affiliations
[1] Univ Technol Sydney, Climate Change Cluster, Ultimo, NSW, Australia
Source
METHODS IN ECOLOGY AND EVOLUTION | 2019 / Volume 10 / Issue 01
Keywords
bioinformatics; genomics; metagenome bins; metagenomics; taxonomic classification
DOI
10.1111/2041-210X.13095
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
Identification of the taxonomic origin of a DNA sequence is crucial for many sequencing projects, e.g. metagenomics studies, identification of contamination in whole genome sequencing projects and filtering of organisms of interest in marker-gene based community analyses. Last common ancestor algorithms are powerful approaches to estimate the taxonomy of a given sequence and have been widely used for classification of next-generation sequencing (NGS) reads, also known as second-generation sequencing reads. Here, we present BASTA, a basic sequence taxonomy annotator, which extends last common ancestor estimations from sequencing reads to any kind of nucleotide or amino acid sequence by utilizing the NCBI taxonomies of user-defined best hits. BASTA can be configured to use the output of many common sequence comparison tools, e.g. BLAST and Diamond, in conjunction with either provided or user-defined target sequence databases.
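The last common ancestor idea described in the abstract can be illustrated with a minimal Python sketch. This is not BASTA's implementation: the function name `last_common_ancestor` and the example lineages are hypothetical, and the sketch assumes each best hit has already been mapped to a root-to-leaf list of taxon names with the same rank order. It simply returns the deepest taxonomy prefix shared by all hits.

```python
from typing import List, Sequence


def last_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest root-to-leaf prefix shared by all lineages.

    Each lineage is a list of taxon names ordered from the root downwards
    (an assumption of this sketch, not a documented BASTA format).
    """
    if not lineages:
        return []
    lca: List[str] = []
    # zip(*lineages) walks all lineages rank by rank and stops at the shortest.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            lca.append(taxa_at_rank[0])
        else:
            break  # first disagreement: everything below is ambiguous
    return lca


# Hypothetical lineages of three best hits for one query sequence.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Vibrionales"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rhizobiales"],
]

print(last_common_ancestor(hits))  # ['Bacteria', 'Proteobacteria']
```

A strict prefix intersection like this is the simplest variant; practical tools often relax it, for example by requiring only a majority of hits to agree at each rank, which this sketch does not attempt.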
Pages: 100-103
Page count: 4
Related papers
14 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   UniProt: the universal protein knowledgebase [J].
Bateman, Alex ;
Martin, Maria Jesus ;
O'Donovan, Claire ;
Magrane, Michele ;
Alpi, Emanuele ;
Antunes, Ricardo ;
Bely, Benoit ;
Bingley, Mark ;
Bonilla, Carlos ;
Britto, Ramona ;
Bursteinas, Borisas ;
Bye-A-Jee, Hema ;
Cowley, Andrew ;
Da Silva, Alan ;
De Giorgi, Maurizio ;
Dogan, Tunca ;
Fazzini, Francesco ;
Castro, Leyla Garcia ;
Figueira, Luis ;
Garmiri, Penelope ;
Georghiou, George ;
Gonzalez, Daniel ;
Hatton-Ellis, Emma ;
Li, Weizhong ;
Liu, Wudong ;
Lopez, Rodrigo ;
Luo, Jie ;
Lussi, Yvonne ;
MacDougall, Alistair ;
Nightingale, Andrew ;
Palka, Barbara ;
Pichler, Klemens ;
Poggioli, Diego ;
Pundir, Sangya ;
Pureza, Luis ;
Qi, Guoying ;
Rosanoff, Steven ;
Saidi, Rabie ;
Sawford, Tony ;
Shypitsyna, Aleksandra ;
Speretta, Elena ;
Turner, Edward ;
Tyagi, Nidhi ;
Volynkin, Vladimir ;
Wardell, Tony ;
Warner, Kate ;
Watkins, Xavier ;
Zaru, Rossana ;
Zellner, Hermann ;
Xenarios, Ioannis .
NUCLEIC ACIDS RESEARCH, 2017, 45 (D1) :D158-D169
[3]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[4]   Fast and sensitive protein alignment using DIAMOND [J].
Buchfink, Benjamin ;
Xie, Chao ;
Huson, Daniel H. .
NATURE METHODS, 2015, 12 (01) :59-60
[5]   QIIME allows analysis of high-throughput community sequencing data [J].
Caporaso, J. Gregory ;
Kuczynski, Justin ;
Stombaugh, Jesse ;
Bittinger, Kyle ;
Bushman, Frederic D. ;
Costello, Elizabeth K. ;
Fierer, Noah ;
Pena, Antonio Gonzalez ;
Goodrich, Julia K. ;
Gordon, Jeffrey I. ;
Huttley, Gavin A. ;
Kelley, Scott T. ;
Knights, Dan ;
Koenig, Jeremy E. ;
Ley, Ruth E. ;
Lozupone, Catherine A. ;
McDonald, Daniel ;
Muegge, Brian D. ;
Pirrung, Meg ;
Reeder, Jens ;
Sevinsky, Joel R. ;
Turnbaugh, Peter J. ;
Walters, William A. ;
Widmann, Jeremy ;
Yatsunenko, Tanya ;
Zaneveld, Jesse ;
Knight, Rob .
NATURE METHODS, 2010, 7 (05) :335-336
[6]   PhyloSift: phylogenetic analysis of genomes and metagenomes [J].
Darling, Aaron E. ;
Jospin, Guillaume ;
Lowe, Eric ;
Matsen, Frederick A., IV ;
Bik, Holly M. ;
Eisen, Jonathan A. .
PEERJ, 2014, 2
[7]   MEGAN analysis of metagenomic data [J].
Huson, Daniel H. ;
Auch, Alexander F. ;
Qi, Ji ;
Schuster, Stephan C. .
GENOME RESEARCH, 2007, 17 (03) :377-386
[8]   Kahlke T., 2017, SYMBIODINIUM ITS2 AM, DOI 10.17605/osf.io/hcsp4
[9]   Kahlke T., 2018, TIMKAHLKE BASTA PUBL, DOI 10.5281/zenodo.1413751
[10]   Comprehensive benchmarking and ensemble approaches for metagenomic classifiers [J].
McIntyre, Alexa B. R. ;
Ounit, Rachid ;
Afshinnekoo, Ebrahim ;
Prill, Robert J. ;
Henaff, Elizabeth ;
Alexander, Noah ;
Minot, Samuel S. ;
Danko, David ;
Foox, Jonathan ;
Ahsanuddin, Sofia ;
Tighe, Scott ;
Hasan, Nur A. ;
Subramanian, Poorani ;
Moffat, Kelly ;
Levy, Shawn ;
Lonardi, Stefano ;
Greenfield, Nick ;
Colwell, Rita R. ;
Rosen, Gail L. ;
Mason, Christopher E. .
GENOME BIOLOGY, 2017, 18