19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology

Cited by: 36
Authors
Fourment, Mathieu [1]
Magee, Andrew F. [2]
Whidden, Chris [3]
Bilge, Arman [3]
Matsen, Frederick A. [3]
Minin, Vladimir N. [4]
Affiliations
[1] Univ Technol Sydney, Ithree Inst, Ultimo, NSW 2007, Australia
[2] Univ Washington, Dept Biol, Seattle, WA 98195 USA
[3] Fred Hutchinson Canc Res Ctr, Seattle, WA 98109 USA
[4] Univ Calif Irvine, Dept Stat, Irvine, CA 92697 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
Bayesian inference; evidence; importance sampling; model selection; variational Bayes; BAYESIAN-INFERENCE; MODEL; APPROXIMATIONS; PROPOSALS; SAMPLER; PRIORS;
DOI
10.1093/sysbio/syz046
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here, we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real data sets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.
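The integral described in the abstract can be made concrete on a deliberately tiny case. The sketch below is an illustration only and is not taken from the paper: it assumes a two-sequence "tree" with a single branch length t, the JC69 model, and an exponential prior with a hypothetical rate of 10. It compares a midpoint-rule quadrature of the likelihood-times-prior integral with a simple Monte Carlo average of the likelihood over draws from the prior. All function names, the prior, and the data summary (100 sites, 10 differences) are assumptions chosen for the example.

```python
import math
import random

PRIOR_RATE = 10.0  # assumed exponential prior on branch length t (mean 0.1)


def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of a two-sequence alignment under JC69 at distance t."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # joint probability of one identical site pair
    p_diff = 0.25 * (0.25 - 0.25 * e)  # joint probability of one specific mismatch
    if n_diff > 0 and p_diff <= 0.0:
        return -math.inf  # t is numerically zero but the data show differences
    log_lik = (n_sites - n_diff) * math.log(p_same)
    if n_diff > 0:
        log_lik += n_diff * math.log(p_diff)
    return log_lik


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_marginal_quadrature(n_sites, n_diff, upper=10.0, n_steps=20000):
    """Midpoint-rule approximation of log of the integral of likelihood * prior."""
    h = upper / n_steps
    terms = []
    for i in range(n_steps):
        t = (i + 0.5) * h
        log_prior = math.log(PRIOR_RATE) - PRIOR_RATE * t
        terms.append(jc69_log_likelihood(t, n_sites, n_diff) + log_prior)
    return log_sum_exp(terms) + math.log(h)


def log_marginal_prior_sampling(n_sites, n_diff, n_samples=20000, seed=1):
    """Monte Carlo estimate: log of the mean likelihood over prior draws."""
    rng = random.Random(seed)
    logs = [
        jc69_log_likelihood(rng.expovariate(PRIOR_RATE), n_sites, n_diff)
        for _ in range(n_samples)
    ]
    return log_sum_exp(logs) - math.log(n_samples)


if __name__ == "__main__":
    print("quadrature    :", log_marginal_quadrature(100, 10))
    print("prior sampling:", log_marginal_prior_sampling(100, 10))
```

With one parameter, quadrature is cheap and the prior-sampling average agrees with it closely. On a real topology every branch length adds a dimension to the integral, so grid-based quadrature quickly becomes infeasible and sampling from the prior rarely lands where the likelihood is high, which is the scaling problem the abstract refers to.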
Pages: 209-220
Number of pages: 12