Bayesian evolutionary model testing in the phylogenomics era: matching model complexity with computational efficiency

Cited by: 66
Authors
Baele, Guy [1]
Lemey, Philippe [1]
Affiliations
[1] Katholieke Univ Leuven, Rega Inst, Dept Microbiol & Immunol, B-3000 Louvain, Belgium
Funding
European Research Council
Keywords
MARGINAL LIKELIHOOD ESTIMATION; PROTEIN-CODING GENES; AMINO-ACID SITES; PHYLOGENETIC ANALYSIS; POSITIVE SELECTION; STATISTICAL PHYLOGENETICS; NUCLEOTIDE SUBSTITUTION; MOLECULAR CLOCK; SEQUENCE DATA; INFERENCE;
DOI
10.1093/bioinformatics/btt340
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: The advent of new sequencing technologies has led to increasing amounts of data being available to perform phylogenetic analyses, with genomic data giving rise to the field of phylogenomics. High-performance computing is becoming an indispensable research tool to fit complex evolutionary models, which take into account specific genomic properties, to large datasets. Here, we perform an extensive Bayesian phylogenetic model selection study, comparing codon and nucleotide substitution models, including codon position partitioning for nucleotide data as well as gene-specific substitution models for both data types. For the best-fitting partitioned models, we also compare independent partitioning with standard diffuse prior specification to conditional partitioning via hierarchical prior specification. To compare the different models, we use state-of-the-art marginal likelihood estimation techniques, including path sampling and stepping-stone sampling.
Results: We show that a full codon model best describes the features of a whole mitochondrial genome dataset, consisting of 12 protein-coding genes, but only when each gene is allowed to evolve under a separate codon model. However, when using hierarchical prior specification for the partition-specific parameters instead of independent diffuse priors, codon position partitioned nucleotide models can still outperform standard codon models. We demonstrate the feasibility of fitting such a combination of complex models using the BEAGLE library for BEAST in combination with recent graphics cards. We argue that the development and use of such models need to be accompanied by state-of-the-art marginal likelihood estimators because the more traditional and computationally less demanding estimators do not offer adequate accuracy.
Pages: 1970-1979
Page count: 10
Related papers
43 in total
  • [1] Ayres DL, 2012, SYST BIOL, V61, P170, DOI [10.1093/sysbio/syr100, 10.1093/sysbio/sys029]
  • [2] Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution
    Baele, Guy
    Lemey, Philippe
    Vansteelandt, Stijn
    [J]. BMC BIOINFORMATICS, 2013, 14
  • [3] Accurate Model Selection of Relaxed Molecular Clocks in Bayesian Phylogenetics
    Baele, Guy
    Li, Wai Lok Sibon
    Drummond, Alexei J.
    Suchard, Marc A.
    Lemey, Philippe
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2013, 30 (02) : 239 - 243
  • [4] Improving the Accuracy of Demographic and Molecular Clock Model Comparison While Accommodating Phylogenetic Uncertainty
    Baele, Guy
    Lemey, Philippe
    Bedford, Trevor
    Rambaut, Andrew
    Suchard, Marc A.
    Alekseyenko, Alexander V.
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2012, 29 (09) : 2157 - 2167
  • [5] Context-dependent codon partition models provide significant increases in model fit in atpB and rbcL protein-coding genes
    Baele, Guy
    Van de Peer, Yves
    Vansteelandt, Stijn
    [J]. BMC EVOLUTIONARY BIOLOGY, 2011, 11
  • [6] Accounting for gene rate heterogeneity in phylogenetic inference
    Bevan, Rachel B.
    Bryant, David
    Lang, B. Franz
    [J]. SYSTEMATIC BIOLOGY, 2007, 56 (02) : 194 - 205
  • [7] PARTITIONING AND COMBINING DATA IN PHYLOGENETIC ANALYSIS
    BULL, JJ
    HUELSENBECK, JP
    CUNNINGHAM, CW
    SWOFFORD, DL
    WADDELL, PJ
    [J]. SYSTEMATIC BIOLOGY, 1993, 42 (03) : 384 - 397
  • [8] Graph hierarchies for phylogeography
    Cybis, Gabriela B.
    Sinsheimer, Janet S.
    Lemey, Philippe
    Suchard, Marc A.
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2013, 368 (1614)
  • [9] Phylogenomics and the reconstruction of the tree of life
    Delsuc, F
    Brinkmann, H
    Philippe, H
    [J]. NATURE REVIEWS GENETICS, 2005, 6 (05) : 361 - 375
  • [10] Relaxed phylogenetics and dating with confidence
    Drummond, Alexei J.
    Ho, Simon Y. W.
    Phillips, Matthew J.
    Rambaut, Andrew
    [J]. PLOS BIOLOGY, 2006, 4 (05) : 699 - 710