Metalign: efficient alignment-based metagenomic profiling via containment min hash

Cited by: 32
Authors
LaPierre, Nathan [1]
Alser, Mohammed [2]
Eskin, Eleazar [1, 3, 4]
Koslicki, David [5, 6, 7]
Mangul, Serghei [8]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[2] Swiss Fed Inst Technol, Dept Comp Sci, Ramistr 101, CH-8092 Zurich, Switzerland
[3] Univ Calif Los Angeles, Dept Computat Med, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[5] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[6] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[7] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
[8] Univ Southern Calif, Dept Clin Pharm, Los Angeles, CA 90089 USA
Funding
U.S. National Science Foundation; U.S. National Institutes of Health
Keywords
Metagenomics; Abundance estimation; Profiling; Alignment; GENOMICS;
DOI
10.1186/s13059-020-02159-0
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Metagenomic profiling, predicting the presence and relative abundances of microbes in a sample, is a critical first step in microbiome analysis. Alignment-based approaches are often considered accurate yet computationally infeasible. Here, we present a novel method, Metalign, that performs efficient and accurate alignment-based metagenomic profiling. We use a novel containment min hash approach to pre-filter the reference database prior to alignment and then process both uniquely aligned and multi-aligned reads to produce accurate abundance estimates. In performance evaluations on both real and simulated datasets, Metalign is the only method evaluated that maintained high performance and competitive running time across all datasets.
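The pre-filtering idea in the abstract can be illustrated with a minimal Python sketch. This is not Metalign's implementation: the function names (`kmers`, `hash_kmer`, `sketch`, `containment`, `prefilter`), the k-mer size, the sketch size, and the cutoff are all hypothetical choices made for illustration, and an exact Python set stands in for whatever membership structure a real tool would use over the sample. The sketch estimates the containment index |A ∩ B| / |A| of each reference genome's k-mers (A) in the sample's k-mers (B) from a bottom-s min hash sketch of A, and keeps only references above the cutoff as candidates for alignment.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer with a stable hash."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(kmer_set, s):
    """Bottom-s min hash sketch: the s smallest hash values of the set."""
    return sorted(hash_kmer(x) for x in kmer_set)[:s]


def containment(ref_sketch, sample_hashes):
    """Estimate |ref ∩ sample| / |ref| as the fraction of sketch
    elements that also occur among the sample's k-mer hashes."""
    if not ref_sketch:
        return 0.0
    hits = sum(1 for value in ref_sketch if value in sample_hashes)
    return hits / len(ref_sketch)


def prefilter(references, sample_reads, k=5, s=50, cutoff=0.5):
    """Keep references whose estimated containment in the sample
    meets the (illustrative) cutoff; returns {name: estimate}."""
    sample_hashes = set()
    for read in sample_reads:
        sample_hashes.update(hash_kmer(x) for x in kmers(read, k))
    kept = {}
    for name, seq in references.items():
        estimate = containment(sketch(kmers(seq, k), s), sample_hashes)
        if estimate >= cutoff:
            kept[name] = estimate
    return kept


if __name__ == "__main__":
    # Toy data: genome_a is fully present in the reads, genome_b is absent.
    refs = {"genome_a": "ACGTTGCAAGGCTTAC", "genome_b": "TTTTTTTTTTTT"}
    reads = ["ACGTTGCAAGGCTTAC"]
    print(prefilter(refs, reads))
```

Only the references that survive this filter would then be passed to the aligner, which is what keeps the alignment step tractable on a large database.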
Pages: 15
Related papers
34 in total
  • [1] Afshinnekoo E, 2015, CELL SYST, V1, P97, DOI 10.1016/j.cels.2015.07.006
  • [2] The effect of host genetics on the gut microbiome
    Bonder, Marc Jan
    Kurilshikov, Alexander
    Tigchelaar, Ettje F.
    Mujagic, Zlatan
    Imhann, Floris
    Vila, Arnau Vich
    Deelen, Patrick
    Vatanen, Tommi
    Schirmer, Melanie
    Smeekens, Sanne P.
    Zhernakova, Dania V.
    Jankipersadsing, Soesma A.
    Jaeger, Martin
    Oosting, Marije
    Cenit, Maria Carmen
    Masclee, Ad A. M.
    Swertz, Morris A.
    Li, Yang
    Kumar, Vinod
    Joosten, Leo
    Harmsen, Hermie
    Weersma, Rinse K.
    Franke, Lude
    Hofker, Marten H.
    Xavier, Ramnik J.
    Jonkers, Daisy
    Netea, Mihai G.
    Wijmenga, Cisca
    Fu, Jingyuan
    Zhernakova, Alexandra
    [J]. NATURE GENETICS, 2016, 48 (11) : 1407 - 1412
  • [3] Fast and sensitive protein alignment using DIAMOND
    Buchfink, Benjamin
    Xie, Chao
    Huson, Daniel H.
    [J]. NATURE METHODS, 2015, 12 (01) : 59 - 60
  • [4] Bushnell B., 2014, BBMAP FAST ACCURATE
  • [5] The metagenomics of soil
    Daniel, R
    [J]. NATURE REVIEWS MICROBIOLOGY, 2005, 3 (06) : 470 - 478
  • [6] Microbial community genomics in the ocean
    DeLong, EF
    [J]. NATURE REVIEWS MICROBIOLOGY, 2005, 3 (06) : 459 - 469
  • [7] Accurate read-based metagenome characterization using a hierarchical suite of unique signatures
    Freitas, Tracey Allen K.
    Li, Po-E
    Scholz, Matthew B.
    Chain, Patrick S. G.
    [J]. NUCLEIC ACIDS RESEARCH, 2015, 43 (10)
  • [8] PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes
    Gregor, Ivan
    Droege, Johannes
    Schirmer, Melanie
    Quince, Christopher
    McHardy, Alice C.
    [J]. PEERJ, 2016, 4
  • [9] Metagenomics: Application of genomics to uncultured microorganisms
    Handelsman, J
    [J]. MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, 2004, 68 (04) : 669 - +
  • [10] MEGAN analysis of metagenomic data
    Huson, Daniel H.
    Auch, Alexander F.
    Qi, Ji
    Schuster, Stephan C.
    [J]. GENOME RESEARCH, 2007, 17 (03) : 377 - 386