MIMt: a curated 16S rRNA reference database with less redundancy and higher accuracy at species-level identification

Cited by: 0
Authors
Cabezas, M. Pilar [1, 2]
Fonseca, Nuno A. [3, 4]
Munoz-Merida, Antonio [3, 4]
Affiliations
[1] Univ Minho, Ctr Mol & Environm Biol CBMA, Dept Biol, Campus Gualtar, P-4710057 Braga, Portugal
[2] Univ Minho, Inst Sci & Innovat Biosustainabil IB S, Campus Gualtar, P-4710057 Braga, Portugal
[3] CIBIO InBIO, Res Ctr Biodivers & Genet Resources, P-4485661 Vairao, Portugal
[4] CIBIO, BIOPOLIS Program Genom Biodivers & Land Planning, Campus Vairao, P-4485661 Vairao, Portugal
Funding
European Union Horizon 2020;
Keywords
GENE DATABASE; BACTERIAL; SILVA; ASSIGNMENT; GREENGENES; CONSISTENT; TAXONOMY;
DOI
10.1186/s40793-024-00634-w
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Motivation: Accurate determination and quantification of the taxonomic composition of microbial communities, especially at the species level, is one of the major issues in metagenomics. This is primarily due to the limitations of commonly used 16S rRNA reference databases, which contain either substantial redundancy or a high percentage of sequences with missing taxonomic information. This may lead to erroneous identifications and, thus, to inaccurate conclusions regarding the ecological role and importance of those microorganisms in the ecosystem.
Results: The current study presents MIMt, a new 16S rRNA database for the identification of archaea and bacteria, encompassing 47 001 sequences, all precisely identified at species level. In addition, a MIMt2.0 version comprising 32 086 sequences was created using only curated sequences from RefSeq Targeted loci. MIMt is intended to be updated twice a year to include all newly sequenced species. We evaluated MIMt against Greengenes, RDP, GTDB and SILVA in terms of sequence distribution and taxonomic assignment accuracy. Our results showed that MIMt contains less redundancy and, despite being 20 to 500 times smaller than existing databases, outperforms them in completeness and taxonomic accuracy, enabling more precise assignments at lower taxonomic ranks and thus significantly improving species-level identification.
Pages: 13
Related papers
44 in total
  • [21] Comparison of microbiota in the cloaca, colon, and magnum of layer chicken
    Lee, Seo-Jin
    Cho, Seongwoo
    La, Tae-Min
    Lee, Hong-Jae
    Lee, Joong-Bok
    Park, Seung-Yong
    Song, Chang-Seon
    Choi, In-Soo
    Lee, Sang-Won
    [J]. PLOS ONE, 2020, 15 (08)
  • [22] The RDP-II (Ribosomal Database Project)
    Maidak, BL
    Cole, JR
    Lilburn, TG
    Parker, CT
    Saxman, PR
    Farris, RJ
    Garrity, GM
    Olsen, GJ
    Schmidt, TM
    Tiedje, JM
    [J]. NUCLEIC ACIDS RESEARCH, 2001, 29 (01) : 173 - 174
  • [23] McDonald D, 2024, NAT BIOTECHNOL, V42, P715, DOI 10.1038/s41587-023-01845-1
  • [24] An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea
    McDonald, Daniel
    Price, Morgan N.
    Goodrich, Julia
    Nawrocki, Eric P.
    DeSantis, Todd Z.
    Probst, Alexander
    Andersen, Gary L.
    Knight, Rob
    Hugenholtz, Philip
    [J]. ISME JOURNAL, 2012, 6 (03) : 610 - 618
  • [25] Park Sang-Cheol, 2018, Genomics & Informatics, V16, pe24, DOI 10.5808/GI.2018.16.4.e24
  • [26] GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy
    Parks, Donovan H.
    Chuvochina, Maria
    Rinke, Christian
    Mussig, Aaron J.
    Chaumeil, Pierre-Alain
    Hugenholtz, Philip
    [J]. NUCLEIC ACIDS RESEARCH, 2022, 50 (D1) : D785 - D794
  • [27] A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life
    Parks, Donovan H.
    Chuvochina, Maria
    Waite, David W.
    Rinke, Christian
    Skarshewski, Adam
    Chaumeil, Pierre-Alain
    Hugenholtz, Philip
    [J]. NATURE BIOTECHNOLOGY, 2018, 36 (10) : 996 - +
  • [28] LPSN-list of prokaryotic names with standing in nomenclature
    Parte, Aidan C.
    [J]. NUCLEIC ACIDS RESEARCH, 2014, 42 (D1) : D613 - D616
  • [29] Microbial Diversity in Extreme Marine Habitats and Their Biomolecules
    Poli, Annarita
    Finore, Ilaria
    Romano, Ida
    Gioiello, Alessia
    Lama, Licia
    Nicolaus, Barbara
    [J]. MICROORGANISMS, 2017, 5 (02)
  • [30] SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB
    Pruesse, Elmar
    Quast, Christian
    Knittel, Katrin
    Fuchs, Bernhard M.
    Ludwig, Wolfgang
    Peplies, Joerg
    Gloeckner, Frank Oliver
    [J]. NUCLEIC ACIDS RESEARCH, 2007, 35 (21) : 7188 - 7196