Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

Cited by: 60
Authors
Lan, Yemin [2]
Rosen, Gail [3]
Hershberg, Ruth [1]
Affiliations
[1] Technion Israel Inst Technol, Ruth & Bruce Rappaport Fac Med, Dept Genet & Dev Biol, Rachel & Menachem Mendelovitch Evolutionary Proc, IL-31096 Haifa, Israel
[2] Drexel Univ, Sch Biomed Engn Sci & Hlth Syst, 3141 Chestnut St, Philadelphia, PA 19104 USA
[3] Drexel Univ, Dept Elect & Comp Engn, Ecol & Evolutionary Signal Proc & Informat Lab, 3141 Chestnut St, Philadelphia, PA 19104 USA
Funding
US National Science Foundation
Keywords
Less-conserved genes; Lineage-specific; Marker genes; Genome-wide similarity; 16S RIBOSOMAL-RNA; DATABASE; DIVERSITY; BACTERIAL; METAGENOMICS; RECONSTRUCTION; IDENTIFICATION; MICROBIOME; ACCURATE;
DOI
10.1186/s40168-016-0162-5
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification codes
071005; 100705
Abstract
Background: The 16S rRNA gene is so far the most widely used marker for the taxonomic classification and separation of prokaryotes. Because it is universally conserved among prokaryotes, it can be used to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to distinguish between prokaryotes at finer taxonomic levels.
Results: In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find many other marker genes whose levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage.
Conclusions: Our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to predict genome-wide gene sequence similarity between closely related prokaryotes better than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show the lowest levels of sequence conservation within different prokaryotic lineages.
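The comparison described in the abstract, asking how well a marker's pairwise percent identity tracks genome-wide similarity, can be illustrated with a minimal sketch. This is not the authors' pipeline: the use of Pearson correlation, the function names `pearson_r` and `rank_markers`, the marker name `marker_A`, and all numbers below are assumptions invented purely for illustration and are not data from the paper or from POGO-DB.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A marker with identical identity in every pair carries no signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_markers(marker_identity, genome_similarity):
    """Sort markers by how strongly their identity tracks genome-wide similarity.

    marker_identity: dict mapping marker name -> percent identity per genome pair
    genome_similarity: genome-wide similarity for the same genome pairs, same order
    """
    scores = {
        name: pearson_r(values, genome_similarity)
        for name, values in marker_identity.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up values for five hypothetical pairs of closely related strains.
genome_similarity = [99.0, 97.5, 96.0, 94.0, 92.5]
marker_identity = {
    # Highly conserved marker: almost no variation between close relatives.
    "16S_rRNA": [100.0, 99.9, 100.0, 99.9, 99.8],
    # Hypothetical less-conserved marker: varies along with the genome.
    "marker_A": [99.2, 97.0, 95.5, 93.0, 91.0],
}

ranking = rank_markers(marker_identity, genome_similarity)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

In this toy input the less-conserved marker ranks first because its identity values spread out across the strain pairs, whereas the highly conserved marker stays near 100% and so has little range with which to track genome-wide differences.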
Pages: 13