ngsLCA-A toolkit for fast and flexible lowest common ancestor inference and taxonomic profiling of metagenomic data

Cited by: 20
Authors
Wang, Yucheng [1, 2, 3, 4]
Korneliussen, Thorfinn Sand [2]
Holman, Luke E. [5, 6]
Manica, Andrea [1]
Pedersen, Mikkel Winther [2]
Affiliations
[1] Univ Cambridge, Dept Zool, Cambridge, England
[2] Univ Copenhagen, Lundbeck Fdn, Globe Inst, GeoGenet Ctr, Copenhagen K, Denmark
[3] Chinese Acad Sci, Inst Tibetan Plateau Res ITPCAS, State Key Lab Tibetan Plateau Earth Syst Environm, ALPHA, Beijing, Peoples R China
[4] BGI Shenzhen, BGI, Shanghai, Peoples R China
[5] Univ Southampton, Natl Oceanog Ctr Southampton, Sch Ocean & Earth Sci, Southampton, Hants, England
[6] Univ Copenhagen, Fac Hlth & Med Sci, Globe Inst, Sect Evolutionary Genom, Copenhagen, Denmark
Source
METHODS IN ECOLOGY AND EVOLUTION | 2022, Volume 13, Issue 12
Keywords
environmental DNA (eDNA); lowest common ancestor (LCA); metagenomics; next-generation sequencing; sedimentary ancient DNA (sedaDNA); shotgun sequencing; taxonomic profiling; toolkit; ENVIRONMENTAL DNA; READ ALIGNMENT; VEGETATION; VIABILITY; SEQUENCE
DOI
10.1111/2041-210X.14006
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Discipline classification codes
071012; 0713
Abstract
Metagenomic data generated from environmental samples is increasingly common in the analysis of modern and ancient biological communities. To obtain taxonomic profiles from this type of data, DNA sequences are aligned against large genomic reference databases, and the lowest common ancestor (LCA) must be inferred for each sequence with multiple alignments. To date, efforts have mainly focused on improving the speed, sensitivity and specificity of alignment tools, and little effort has been applied to the LCA algorithm that generates the taxonomic profiles from alignments. We present ngsLCA, a command-line toolkit with two separate modules: the main program (in C/C++), which performs LCA inference, and an R package for generating tables and visualisations of the taxonomic profiles. ngsLCA processed large datasets in BAM/SAM alignment format 4-11 times faster and used less memory than other available programs. It is compatible with the NCBI taxonomy and has flexible parameter settings. Furthermore, the toolkit offers functions for filtering, contamination removal, taxonomic clustering, and multiple ways of visualising the generated taxonomic profiles. ngsLCA bridges a gap in current metagenomic analyses by supplying a computationally light, easy-to-use, accurate, fast and flexible LCA algorithm with R functions for processing and illustrating the taxonomic profiles.
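
The LCA step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not ngsLCA's implementation (which is written in C/C++ and works on the NCBI taxonomy); the toy taxonomy `PARENT`, the read names, and the helper functions `lineage`, `lca_pair`, `lca` and `assign_reads` are all hypothetical and exist only to show the general idea: a read that aligns to several taxa is assigned to the deepest node shared by all of their lineages.

```python
from functools import reduce

# Hypothetical toy taxonomy as a child -> parent map (the root is its own parent).
PARENT = {
    "root": "root",
    "Viridiplantae": "root",
    "Poaceae": "Viridiplantae",
    "Salicaceae": "Viridiplantae",
    "Poa": "Poaceae",
    "Festuca": "Poaceae",
    "Salix": "Salicaceae",
}


def lineage(taxon):
    """Return the path from `taxon` up to the root, starting with `taxon`."""
    path = [taxon]
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path


def lca_pair(a, b):
    """Lowest common ancestor of two taxa in the toy taxonomy."""
    ancestors_of_a = set(lineage(a))
    # Walk upward from b; the first node also found above a is the LCA.
    for node in lineage(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("taxa do not share a root")


def lca(taxa):
    """Lowest common ancestor of one or more taxa."""
    return reduce(lca_pair, taxa)


def assign_reads(alignments):
    """Map each read to the LCA of all taxa it aligned to."""
    return {read: lca(taxa) for read, taxa in alignments.items()}


if __name__ == "__main__":
    # Hypothetical reads and the taxa of the references they aligned to.
    hits = {
        "read1": ["Poa"],
        "read2": ["Poa", "Festuca"],
        "read3": ["Poa", "Salix"],
    }
    for read, taxon in assign_reads(hits).items():
        print(read, taxon)
```

In this sketch a read with a single hit keeps that taxon, while a read whose hits span two families is pushed up to their shared ancestor. A real tool would additionally apply alignment-similarity thresholds before computing the LCA; those are omitted here.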
Pages: 2699-2708
Page count: 10