Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods

Cited by: 77
Authors
Ogilvie, Huw A. [1]
Heled, Joseph [2, 3]
Xie, Dong [2, 3]
Drummond, Alexei J. [2, 3]
Affiliations
[1] Australian Natl Univ, Res Sch Biol, Evolut Ecol & Genet, Canberra, ACT, Australia
[2] Univ Auckland, Dept Comp Sci, Auckland 1, New Zealand
[3] Univ Auckland, Allan Wilson Ctr Mol Ecol & Evolut, Auckland 1, New Zealand
Funding
Australian Research Council;
Keywords
Bayesian phylogenetics; Concatenation; Gene tree; Multispecies coalescent; Phylogenomics; Species tree; Supermatrix; SPECIES TREE ESTIMATION; GENE TREES; PHYLOGENETIC ANALYSIS; MAXIMUM-LIKELIHOOD; COALESCENT; HYBRIDIZATION; DISCOVERY; INFERENCE; RECONSTRUCTION; CONCATENATION;
DOI
10.1093/sysbio/syv118
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.
Pages: 381-396
Page count: 16