MTSv: rapid alignment-based taxonomic classification and high-confidence metagenomic analysis

Cited by: 2
Authors
Furstenau, Tara N. [1]
Schneider, Tsosie [1]
Shaffer, Isaac [1]
Vazquez, Adam J. [2]
Sahl, Jason [2]
Fofanov, Viacheslav [1, 2]
Affiliations
[1] Northern Arizona University, School of Informatics, Computing, and Cyber Systems, Flagstaff, AZ 86011, USA
[2] Northern Arizona University, Pathogen and Microbiome Institute, Flagstaff, AZ, USA
Source
PeerJ | 2022 / Volume 10
Keywords
Metagenomics; Taxonomic classification; Alignment; Pathogen detection; BACILLUS-CEREUS; READ ALIGNMENT; MICROBIOME; DATABASE; PLAGUE; GENE;
DOI
10.7717/peerj.14292
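Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code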
07 ; 0710 ; 09 ;
Abstract
As the size of reference sequence databases and high-throughput sequencing datasets continues to grow, it is becoming computationally infeasible to use traditional alignment against large genome databases for taxonomic classification of metagenomic reads. Exact matching approaches can rapidly assign taxonomy and summarize the composition of microbial communities, but they sacrifice accuracy and can lead to false positives. Full alignment tools provide higher-confidence assignments and can assign sequences from genomes that diverge from reference sequences; however, they are computationally intensive. To address this, we designed MTSv specifically for alignment-based taxonomic assignment in metagenomic analysis. This tool implements an FM-index-assisted q-gram filter and a SIMD-accelerated Smith-Waterman algorithm to find alignments. Unlike traditional aligners, however, MTSv will not attempt to make additional alignments to a TaxID once an alignment of sufficient quality has been found. This improves efficiency when many reference sequences are available per taxon. MTSv was designed to be flexible and can be modified to run on either memory-constrained or processor-constrained systems. Although MTSv cannot compete with the speeds of exact k-mer matching approaches, it is reasonably fast and has higher precision than popular exact matching approaches. Because MTSv performs a full alignment, it can classify reads even when the genomes share low similarity with reference sequences, and it provides a tool for high-confidence pathogen detection with low off-target assignments to near-neighbor species.
Pages: 28