Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, and Interpreting Novel, Deep Branches in Marker Gene Phylogenetic Trees

Cited by: 62
Authors
Wu, Dongying [1]
Wu, Martin [1, 4]
Halpern, Aaron [2, 3]
Rusch, Douglas B. [2, 3]
Yooseph, Shibu [2, 3]
Frazier, Marvin [2, 3]
Venter, J. Craig [2, 3]
Eisen, Jonathan A. [1]
Affiliations
[1] Univ Calif Davis, Univ Calif Davis Genome Ctr, Dept Med Microbiol & Immunol, Dept Ecol & Evolut, Davis, CA 95616 USA
[2] J Craig Venter Inst, Rockville, MD USA
[3] J Craig Venter Inst, La Jolla, CA USA
[4] Univ Virginia, Charlottesville, VA USA
Funding
U.S. National Science Foundation
Keywords
16S RIBOSOMAL-RNA; MULTIPLE SEQUENCE ALIGNMENT; MICROBIAL DIVERSITY; GENOME; EVOLUTION; PROTEIN; IDENTIFICATION; PHOTOTROPHY; ALGORITHM; RESOURCE;
DOI
10.1371/journal.pone.0018011
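Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09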
Abstract
Background: Most of our knowledge about the ancient evolutionary history of organisms has been derived from data associated with specific known organisms (i.e., organisms that we can study directly such as plants, metazoans, and culturable microbes). Recently, however, a new source of data for such studies has arrived: DNA sequence data generated directly from environmental samples. Such metagenomic data has enormous potential in a variety of areas including, as we argue here, in studies of very early events in the evolution of gene families and of species.
Methodology/Principal Findings: We designed and implemented new methods for analyzing metagenomic data and used them to search the Global Ocean Sampling (GOS) Expedition data set for novel lineages in three gene families commonly used in phylogenetic studies of known and unknown organisms: small subunit rRNA and the recA and rpoB superfamilies. Though the methods available could not accurately identify very deeply branched ss-rRNAs (largely due to difficulties in making robust sequence alignments for novel rRNA fragments), our analysis revealed the existence of multiple novel branches in the recA and rpoB gene families. Analysis of available sequence data likely from the same genomes as these novel recA and rpoB homologs was then used to further characterize the possible organismal source of the novel sequences.
Conclusions/Significance: Of the novel recA and rpoB homologs identified in the metagenomic data, some likely come from uncharacterized viruses while others may represent ancient paralogs not yet seen in any cultured organism. A third possibility is that some come from novel cellular lineages that are only distantly related to any organisms for which sequence data is currently available. If there exist any major, but so-far-undiscovered, deeply branching lineages in the tree of life, we suggest that methods such as those described herein currently offer the best way to search for them.
Pages: 12