Polymorphism-Aware Species Trees with Advanced Mutation Models, Bootstrap, and Rate Heterogeneity

Cited by: 21
Authors
Schrempf, Dominik [1, 2, 3]
Bui Quang Minh [3, 4, 5]
von Haeseler, Arndt [3, 6]
Kosiol, Carolin [2, 7]
Affiliations
[1] Eotvos Lorand Univ, Dept Biol Phys, Budapest, Hungary
[2] Univ St Andrews, Ctr Biol Divers, St Andrews, Fife, Scotland
[3] Med Univ Vienna, Univ Vienna, Max F Perutz Labs, Ctr Integrat Bioinformat Vienna, Vienna, Austria
[4] Australian Natl Univ, Res Sch Biol, Ecol & Evolut, Canberra, ACT, Australia
[5] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
[6] Univ Vienna, Fac Comp Sci, Bioinformat & Computat Biol, Vienna, Austria
[7] Vetmeduni Vienna, Inst Populat Genet, Vienna, Austria
Funding
Austrian Science Fund; European Research Council; Wellcome Trust (UK);
Keywords
incomplete lineage sorting; species tree; phylogenetics; polymorphism-aware phylogenetic model; boundary mutation model; PHYLOGENETIC MODELS; DNA-SEQUENCES; GENE TREES; SIMULATION; EVOLUTION; SELECTION; ALGORITHMS; DRIFT;
DOI
10.1093/molbev/msz043
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Molecular phylogenetics has long neglected polymorphisms within present and ancestral populations. Recently, multispecies coalescent-based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents the handling of gene trees and directly infers species trees from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate a search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and the assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. While PoMo is more accurate than standard substitution models applied to concatenated alignments, it is almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.
Pages: 1294-1301
Page count: 8