Detecting Phylogenetic Signals in Eukaryotic Whole Genome Sequences

Cited by: 9
Authors
Cohen, Eyal [1]
Chor, Benny [1]
Affiliation
[1] Tel Aviv Univ, Sch Comp Sci, IL-69978 Tel Aviv, Israel
Keywords
alignment-free sequence comparison; average common subsequence (ACS) method; reconstructing multicellular eukaryotic phylogeny; phylogenetic signal; whole genome phylogeny; MAXIMUM-LIKELIHOOD; TREE; DATABASE; MAMMALS; DISTANCES; ALIGNMENT
DOI
10.1089/cmb.2012.0122
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Whole genome sequences are a rich source of molecular data, with a potential for the discovery of novel evolutionary information. Yet many parts of these sequences are not known to be under evolutionary pressure and, thus, are not conserved. Furthermore, a good model for whole genome evolution does not exist. Consequently, it is not a priori clear whether a meaningful phylogenetic signal exists and can be extracted from the sequences as a whole. Indeed, very few phylogenies have been reconstructed from these sequences. Prior to this work, only two reconstruction methods had been applied to large eukaryotic genomes: the K_r method (Haubold et al., 2009), which was applied to genomes of rather small diversity (Drosophila species), and the feature frequency profile (FFP) method (Sims et al., 2009a), which was applied to genomes of moderate diversity (mammals). We investigate the whole genome-based phylogenetic reconstruction question with respect to a much wider taxonomic sample. We apply K_r, FFP, and an alternative alignment-free method, the average common subsequence (ACS) method (Ulitsky et al., 2006), to 24 multicellular eukaryotes (vertebrates, invertebrates, and plants). We also apply ACS to the proteome sequences of these 24 taxa. We compare the resulting trees to a standard reference, the National Center for Biotechnology Information (NCBI) taxonomy tree. Trees produced by ACS(AA), based on proteomes, are in complete agreement with the NCBI tree. For the genome-based reconstruction, ACS(DNA) produces trees whose agreement with the NCBI tree is excellent to very good for divergence times up to 800 million years ago, medium at 1 billion years ago, and poor at 1.6 billion years ago. We conclude that whole genomes do carry a clear phylogenetic signal, yet this signal "saturates" with longer divergence times. Furthermore, among the few existing methods, ACS is the most capable of detecting this signal.
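The abstract names the ACS method without describing it. As a rough illustration only, the sketch below follows the ACS distance as commonly stated for Ulitsky et al. (2006): for each position in one sequence, take the length of the longest substring starting there that also occurs somewhere in the other sequence, average those lengths, normalize by the logarithm of the other sequence's length, subtract a self-comparison correction, and symmetrize. The exact formula, the function names, and the naive quadratic search are assumptions made for this example and are not taken from the record above; a genome-scale implementation would need suffix-array or similar indexing.

```python
import math


def longest_match_lengths(a: str, b: str) -> list[int]:
    """For each start i in a, the length of the longest prefix of a[i:]
    that occurs as a substring anywhere in b (naive search, toy inputs only)."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the match one character at a time while it still occurs in b.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def acs_directed(a: str, b: str) -> float:
    """Directed ACS dissimilarity d(a, b) under the assumed formula
    log|b| / L(a, b) - 2 log|a| / |a|, where L is the average match length."""
    avg_len = sum(longest_match_lengths(a, b)) / len(a)
    if avg_len == 0:
        return math.inf  # no shared characters at all
    return math.log(len(b)) / avg_len - 2 * math.log(len(a)) / len(a)


def acs_distance(a: str, b: str) -> float:
    """Symmetrized ACS dissimilarity: the mean of the two directed values."""
    return 0.5 * (acs_directed(a, b) + acs_directed(b, a))


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    seqs = {"x": "ACGTACGTAC", "y": "ACGTACGTAA", "z": "TTTTGGGGCC"}
    for p, q in [("x", "y"), ("x", "z"), ("y", "z")]:
        print(p, q, round(acs_distance(seqs[p], seqs[q]), 4))
```

A pairwise matrix of such values could then be passed to a distance-based tree builder such as neighbor joining; which tree-building step the study used is not stated in the abstract.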
Pages: 945-956
Page count: 12