Detecting Phylogenetic Signals in Eukaryotic Whole Genome Sequences

Cited by: 9
Authors
Cohen, Eyal [1]
Chor, Benny [1]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, IL-69978 Tel Aviv, Israel
Keywords
alignment-free sequence comparison; average common subsequence (ACS) method; reconstructing multicellular eukaryotic phylogeny; phylogenetic signal; whole genome phylogeny; MAXIMUM-LIKELIHOOD; TREE; DATABASE; MAMMALS; DISTANCES; ALIGNMENT;
DOI
10.1089/cmb.2012.0122
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Whole genome sequences are a rich source of molecular data, with a potential for the discovery of novel evolutionary information. Yet, many parts of these sequences are not known to be under evolutionary pressure and, thus, are not conserved. Furthermore, a good model for whole genome evolution does not exist. Consequently, it is not a priori clear whether a meaningful phylogenetic signal exists and can be extracted from the sequences as a whole. Indeed, very few phylogenies were reconstructed based on these sequences. Prior to this work, only two reconstruction methods were applied to large eukaryotic genomes: the K-r method (Haubold et al., 2009), which was applied to genomes of rather small diversity (Drosophila species), and the feature frequency profile (FFP) method (Sims et al., 2009a), which was applied to genomes of moderate diversity (mammals). We investigate the whole genome-based phylogenetic reconstruction question with respect to a much wider taxonomic sample. We apply K-r, FFP, and an alternative alignment-free method, the average common subsequence (ACS) (Ulitsky et al., 2006), to 24 multicellular eukaryotes (vertebrates, invertebrates, and plants). We also apply ACS to the proteome sequences of these 24 taxa. We compare the resulting trees to a standard reference, the National Center for Biotechnology Information (NCBI) taxonomy tree. Trees produced by ACS(AA), based on proteomes, are in complete agreement with the NCBI tree. For the genome-based reconstruction, ACS(DNA) produces trees whose agreement with the NCBI tree is excellent to very good for divergence times up to 800 million years ago, medium at 1 billion years ago, and poor at 1.6 billion years ago. We conclude that whole genomes do carry a clear phylogenetic signal, yet this signal "saturates" with longer divergence times. Furthermore, of the few existing methods, ACS is best capable of detecting this signal.
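The abstract names the average common subsequence (ACS) method but does not spell it out. The following is a minimal sketch of the general idea, not the authors' implementation: for each position in one sequence, take the length of the longest substring starting there that also occurs somewhere in the other sequence, then average these lengths. The function names, the quadratic-time substring search, and the normalization used in `acs_distance` (a log-length scaling with a self-similarity correction, averaged over both directions) are assumptions made for illustration; the exact definition should be checked against Ulitsky et al. (2006). Genome-scale inputs would need a suffix-array or suffix-tree approach rather than this naive search.

```python
import math


def longest_match_lengths(a: str, b: str) -> list[int]:
    """For each i, length of the longest prefix of a[i:] that occurs anywhere in b."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the match one symbol at a time while it still occurs in b.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def acs(a: str, b: str) -> float:
    """Average, over positions of a, of the longest match length against b."""
    lengths = longest_match_lengths(a, b)
    return sum(lengths) / len(lengths)


def acs_distance(a: str, b: str) -> float:
    """Symmetrized distance derived from ACS (normalization is an assumption)."""

    def directed(x: str, y: str) -> float:
        avg = acs(x, y)
        if avg == 0:
            # No shared symbols at all: treat the sequences as maximally distant.
            return math.inf
        # Longer shared substrings give a smaller value; the second term
        # approximately removes the value a sequence would get against itself.
        return math.log(len(y)) / avg - 2 * math.log(len(x)) / len(x)

    return (directed(a, b) + directed(b, a)) / 2


if __name__ == "__main__":
    query = "ACGTACGT"
    print(acs_distance(query, "ACGTACGA"))  # one substitution away
    print(acs_distance(query, "TTGGCCAA"))  # shares only single symbols
```

Because the functions operate on plain strings, the same sketch applies to nucleotide and amino acid alphabets alike, which mirrors the ACS(DNA) and ACS(AA) variants mentioned in the abstract. A matrix of pairwise distances computed this way could then be passed to any distance-based tree-building method.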
Pages: 945-956
Page count: 12