Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo

Cited by: 434
Authors
Huelsenbeck, JP [1]
Larget, B
Alfaro, ME
Affiliations
[1] Univ Calif San Diego, Div Biol Sci, Sect Ecol Behav & Evolut, San Diego, CA 92103 USA
[2] Univ Wisconsin, Dept Bot, Madison, WI 53706 USA
[3] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
Keywords
Bayesian phylogenetic inference; Markov chain Monte Carlo; maximum likelihood; reversible jump Markov chain Monte Carlo; substitution models
DOI
10.1093/molbev/msh123
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models is examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
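To make "all possible time-reversible models" concrete, the following is a minimal sketch, not the authors' implementation. It assumes each candidate model corresponds to one way of grouping the six exchangeability rates (ordered here as AC, AG, AT, CG, CT, GT) into classes that share a value, and it encodes each grouping as a restricted growth string such as `121121`. The rate ordering, the string encoding, and the helper names `restricted_growth_strings` and `separates_ts_tv` are assumptions introduced for illustration. Under that assumption the number of models is the Bell number of 6.

```python
from typing import Iterator, List, Sequence

# Assumed ordering of the six exchangeability rates (illustrative convention).
RATES = ["AC", "AG", "AT", "CG", "CT", "GT"]
TRANSITIONS = {"AG", "CT"}  # purine<->purine and pyrimidine<->pyrimidine


def restricted_growth_strings(n: int) -> Iterator[List[int]]:
    """Yield every set partition of n items as a restricted growth string.

    Position i holds the class label of item i; a new label may exceed the
    largest label used so far by at most one, so each partition appears once.
    """
    def extend(prefix: List[int], max_label: int) -> Iterator[List[int]]:
        if len(prefix) == n:
            yield list(prefix)
            return
        for label in range(1, max_label + 2):
            prefix.append(label)
            yield from extend(prefix, max(max_label, label))
            prefix.pop()

    yield from extend([], 0)


def separates_ts_tv(labels: Sequence) -> bool:
    """True if no rate class mixes a transition with a transversion."""
    ts = {lab for rate, lab in zip(RATES, labels) if rate in TRANSITIONS}
    tv = {lab for rate, lab in zip(RATES, labels) if rate not in TRANSITIONS}
    return ts.isdisjoint(tv)


models = ["".join(map(str, m)) for m in restricted_growth_strings(len(RATES))]
n_separating = sum(separates_ts_tv(m) for m in models)

print(len(models))    # number of rate groupings (Bell number of 6)
print(n_separating)   # groupings that keep transitions apart from transversions
```

In this encoding, `111111` gives all six rates one shared value, `121121` gives transitions one rate and transversions another, and `123456` lets every rate differ. Only a minority of the groupings keep transitions and transversions in separate classes, which is the property the abstract reports for almost all of the models chosen as best.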
Pages: 1123-1133
Number of pages: 11