Informational Gene Phylogenies Do Not Support a Fourth Domain of Life for Nucleocytoplasmic Large DNA Viruses

Cited by: 61
Authors
Williams, Tom A. [1]
Embley, T. Martin [1]
Heinz, Eva [1]
Affiliations
[1] Newcastle Univ, Inst Cell & Mol Biosci, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Keywords
MULTIPLE SEQUENCE ALIGNMENT; PROTEIN EVOLUTION; MIXTURE-MODELS; ORIGIN; TREE; RECONSTRUCTION; HYDROGENOSOMES; MICROSPORIDIA; SELECTION; PROVIDES
DOI
10.1371/journal.pone.0021080
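Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]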
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Mimivirus is a nucleocytoplasmic large DNA virus (NCLDV) with a genome size (1.2 Mb) and coding capacity (> 1000 genes) comparable to that of some cellular organisms. Unlike other viruses, Mimivirus and its NCLDV relatives encode homologs of broadly conserved informational genes found in Bacteria, Archaea, and Eukaryotes, raising the possibility that they could be placed on the tree of life. A recent phylogenetic analysis of these genes showed the NCLDVs emerging as a monophyletic group branching between Eukaryotes and Archaea. These trees were interpreted as evidence for an independent "fourth domain" of life that may have contributed DNA processing genes to the ancestral eukaryote. However, the analysis of ancient evolutionary events is challenging, and tree reconstruction is susceptible to bias resulting from non-phylogenetic signals in the data. These include compositional heterogeneity and homoplasy, which can lead to the spurious grouping of compositionally-similar or fast-evolving sequences. Here, we show that these informational gene alignments contain both significant compositional heterogeneity and homoplasy, which were not adequately modelled in the original analysis. When we use more realistic evolutionary models that better fit the data, the resulting trees are unable to reject a simple null hypothesis in which these informational genes, like many other NCLDV genes, were acquired by horizontal transfer from eukaryotic hosts. Our results suggest that a fourth domain is not required to explain the available sequence data.
Pages: 11
Related papers
55 entries in total
[1]   ProtTest: selection of best-fit models of protein evolution [J].
Abascal, F ;
Zardoya, R ;
Posada, D .
BIOINFORMATICS, 2005, 21 (09) :2104-2105
[2]   A site- and time-heterogeneous model of amino acid replacement [J].
Blanquart, Samuel ;
Lartillot, Nicolas .
MOLECULAR BIOLOGY AND EVOLUTION, 2008, 25 (05) :842-858
[3]   Bayesian model adequacy and choice in phylogenetics [J].
Bollback, JP .
MOLECULAR BIOLOGY AND EVOLUTION, 2002, 19 (07) :1171-1180
[4]   Phylogenetic and Phyletic Studies of Informational Genes in Genomes Highlight Existence of a 4th Domain of Life Including Giant Viruses [J].
Boyer, Mickael ;
Madoui, Mohammed-Amine ;
Gimenez, Gregory ;
La Scola, Bernard ;
Raoult, Didier .
PLOS ONE, 2010, 5 (12)
[5]   Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis [J].
Castresana, J .
MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (04) :540-552
[6]   Toward automatic reconstruction of a highly resolved tree of life [J].
Ciccarelli, FD ;
Doerks, T ;
von Mering, C ;
Creevey, CJ ;
Snel, B ;
Bork, P .
SCIENCE, 2006, 311 (5765) :1283-1287
[7]   The archaebacterial origin of eukaryotes [J].
Cox, Cymon J. ;
Foster, Peter G. ;
Hirt, Robert P. ;
Harris, Simon R. ;
Embley, T. Martin .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2008, 105 (51) :20356-20361
[8]   The tree of one percent [J].
Dagan, Tal ;
Martin, William .
GENOME BIOLOGY, 2006, 7 (10)
[9]  
Dayhoff M O., 1978, Atlas of Protein Seq Struct, p. 345
[10]   MUSCLE: multiple sequence alignment with high accuracy and high throughput [J].
Edgar, RC .
NUCLEIC ACIDS RESEARCH, 2004, 32 (05) :1792-1797