TaxonKit: A practical and efficient NCBI taxonomy toolkit

Cited by: 147
Authors
Shen, Wei [1]
Ren, Hong [1]
Affiliations
[1] Chongqing Med Univ, Affiliated Hosp 2, Inst Viral Hepatitis, Key Lab Mol Biol Infect Dis,Minist Educ,Dept Infe, Chongqing 400010, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
NCBI Taxonomy; TaxonKit; TaxId; Lineage; TaxId changelog;
DOI
10.1016/j.jgg.2021.03.006
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
The National Center for Biotechnology Information (NCBI) Taxonomy is widely applied in biomedical and ecological studies. Typical demands include querying taxonomy identifiers (TaxIds) by taxonomy names, querying complete taxonomic lineages by TaxIds, listing descendants of given TaxIds, and others. However, existing tools are either limited in functionality or inefficient in terms of runtime. In this work, we present TaxonKit, a command-line toolkit for comprehensive and efficient manipulation of NCBI Taxonomy data. TaxonKit comprises seven core subcommands that provide TaxId querying, listing, and filtering; lineage retrieval and reformatting; lowest common ancestor computation; and TaxId change tracking. Its practical functions, competitive processing performance, scalability across dataset sizes, and good accessibility can facilitate taxonomy data manipulation. TaxonKit is freely available under the permissive MIT license on GitHub, Brewsci, and Bioconda. Copyright (C) 2021, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Limited and Science Press. All rights reserved.
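The typical operations named in the abstract (name-to-TaxId lookup, lineage retrieval, descendant listing, and lowest common ancestor) can be illustrated with a minimal Python sketch over a hand-written toy taxonomy. This is not TaxonKit's implementation or command-line interface; the TaxIds, taxon names, and function names below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child TaxId -> parent TaxId); the root is its own parent.
# These IDs and names are invented for illustration and are not real NCBI TaxIds.
PARENT = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 4, 7: 4}
NAMES = {
    1: "root", 2: "clade A", 3: "clade B",
    4: "genus X", 5: "genus Y",
    6: "species X1", 7: "species X2",
}

# Invert the parent map once so descendants can be listed without rescanning it.
CHILDREN = defaultdict(list)
for child, parent in PARENT.items():
    if child != parent:
        CHILDREN[parent].append(child)


def name_to_taxids(name):
    """Return all TaxIds whose name matches exactly (names need not be unique)."""
    return sorted(t for t, n in NAMES.items() if n == name)


def lineage(taxid):
    """Return the path of TaxIds from the root down to taxid."""
    path = [taxid]
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path[::-1]


def descendants(taxid):
    """Return every TaxId below taxid, excluding taxid itself."""
    found, stack = [], list(CHILDREN.get(taxid, []))
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(CHILDREN.get(node, []))
    return sorted(found)


def lca(a, b):
    """Return the deepest TaxId shared by the lineages of a and b."""
    common = None
    for x, y in zip(lineage(a), lineage(b)):
        if x != y:
            break
        common = x
    return common


if __name__ == "__main__":
    print(name_to_taxids("genus X"))
    print(" > ".join(NAMES[t] for t in lineage(6)))
    print(descendants(2))
    print(lca(6, 5))
```

Walking parent pointers keeps lineage retrieval proportional to tree depth, and computing the lowest common ancestor as the last shared element of two root-to-node lineages is one simple way to do it; the data structures a production tool uses may differ.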
Pages: 844-850
Number of pages: 7