Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics

Cited by: 277
Authors
Edwards, Scott V. [1 ]
Xi, Zhenxiang [1 ]
Janke, Axel [2 ]
Faircloth, Brant C. [3 ,4 ]
McCormack, John E. [5 ]
Glenn, Travis C. [6 ]
Zhong, Bojian [7 ]
Wu, Shaoyuan [8 ,9 ]
Lemmon, Emily Moriarty [10 ]
Lemmon, Alan R. [11 ]
Leache, Adam D. [12 ,13 ]
Liu, Liang [14 ,15 ]
Davis, Charles C. [1 ]
Affiliations
[1] Harvard Univ, Dept Organism & Evolutionary Biol, Cambridge, MA 02138 USA
[2] Senckenberg Gesell Nat Forsch, Senckenberg Biodivers & Climate Res Ctr, D-60325 Frankfurt, Germany
[3] Louisiana State Univ, Dept Biol Sci, Baton Rouge, LA 70803 USA
[4] Louisiana State Univ, Museum Nat Sci, Baton Rouge, LA 70803 USA
[5] Occidental Coll, Moore Lab Zool, Los Angeles, CA 90041 USA
[6] Univ Georgia, Dept Environm Hlth Sci, Athens, GA 30602 USA
[7] Nanjing Normal Univ, Coll Life Sci, Nanjing 210023, Jiangsu, Peoples R China
[8] Tianjin Med Univ, Sch Basic Med Sci, Dept Biochem & Mol Biol, Tianjin 300070, Peoples R China
[9] Tianjin Med Univ, Sch Basic Med Sci, Tianjin Key Lab Med Epigenet, Tianjin 300070, Peoples R China
[10] Florida State Univ, Dept Biol Sci, Tallahassee, FL 32306 USA
[11] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA
[12] Univ Washington, Dept Biol, Seattle, WA 98195 USA
[13] Univ Washington, Burke Museum Nat Hist & Culture, Seattle, WA 98195 USA
[14] Univ Georgia, Dept Stat, Athens, GA 30602 USA
[15] Univ Georgia, Inst Bioinformat, Athens, GA 30602 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
SPECIES-TREE ESTIMATION; PLACENTAL MAMMAL PHYLOGENY; LAND PLANT ORIGINS; LIKELY GENE TREES; MAXIMUM-LIKELIHOOD; DNA-SEQUENCES; SISTER GROUP; EVOLUTIONARY RELATIONSHIPS; HIDDEN SUPPORT; GENOMIC DATA;
DOI
10.1016/j.ympev.2015.10.027
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
In recent articles published in Molecular Phylogenetics and Evolution, Mark Springer and John Gatesy (S&G) present numerous criticisms of recent implementations and testing of the multispecies coalescent (MSC) model in phylogenomics, popularly known as "species tree" methods. After pointing out errors in alignments and gene tree rooting in recent phylogenomic data sets, particularly in Song et al. (2012) on mammals and Xi et al. (2014) on plants, they suggest that these errors seriously compromise the conclusions of these studies. Additionally, S&G enumerate numerous perceived violated assumptions and deficiencies in the application of the MSC model in phylogenomics, such as its assumption of neutrality and in particular the use of transcriptomes, which are deemed inappropriate for the MSC because the constituent exons often subtend large regions of chromosomes within which recombination is substantial. We acknowledge these previously reported errors in recent phylogenomic data sets, but disapprove of S&G's excessively combative and taunting tone. We show that these errors, as well as two nucleotide sorting methods used in the analysis of Amborella, have little impact on the conclusions of those papers. Moreover, several concepts introduced by S&G and an appeal to "first principles" of phylogenetics in an attempt to discredit MSC models are invalid and reveal numerous misunderstandings of the MSC. Contrary to the claims of S&G, we show that recent computer simulations used to test the robustness of MSC models are not circular and do not unfairly favor MSC models over concatenation.
In fact, although both concatenation and MSC models clearly perform well in regions of tree space with long branches and little incomplete lineage sorting (ILS), simulations reveal the erratic behavior of concatenation when subjected to data subsampling and its tendency to produce spuriously confident yet conflicting results in regions of parameter space where MSC models still perform well. S&G's claims that MSC models explain little or none (0-15%) of the gene tree heterogeneity observed in a mammal data set and that MSC models assume ILS as the only source of gene tree variation are flawed. Overall, many of their criticisms of MSC models are invalidated when concatenation is appropriately viewed as a special case of the MSC, which in turn is a special case of emerging network models in phylogenomics. We reiterate that there is enormous promise and value in recent implementations and tests of the MSC and look forward to its increased use and refinement in phylogenomics. (C) 2015 The Authors. Published by Elsevier Inc.
Pages: 447-462
Number of pages: 16
Related papers
119 items in total
[1]   Missing the forest for the trees: Phylogenetic compression and its implications for inferring complex evolutionary histories [J].
Ané, C ;
Sanderson, MJ .
SYSTEMATIC BIOLOGY, 2005, 54 (01) :146-157
[2]   Detecting Phylogenetic Breakpoints and Discordance from Genome-Wide Alignments for Species Tree Reconstruction [J].
Ane, Cecile .
GENOME BIOLOGY AND EVOLUTION, 2011, 3 :246-258
[3]  
[Anonymous], 2013, Journal of Phylogenetics and Evolutionary Biology
[4]   Networks: expanding evolutionary thinking [J].
Bapteste, Eric ;
van Iersel, Leo ;
Janke, Axel ;
Kelchner, Scot ;
Kelk, Steven ;
McInerney, James O. ;
Morrison, David A. ;
Nakhleh, Luay ;
Steel, Mike ;
Stougie, Leen ;
Whitfield, James .
TRENDS IN GENETICS, 2013, 29 (08) :439-441
[5]   Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses [J].
Bayzid, Md Shamsuzzoha ;
Mirarab, Siavash ;
Boussau, Bastien ;
Warnow, Tandy .
PLOS ONE, 2015, 10 (06)
[6]   A new approach to estimate parameters of speciation models with application to apes [J].
Becquet, Celine ;
Przeworski, Molly .
GENOME RESEARCH, 2007, 17 (10) :1505-1519
[7]   Genome-Wide Search Identifies 1.9Mb from the Polar Bear Y Chromosome for Evolutionary Analyses [J].
Bidon, Tobias ;
Schreck, Nancy ;
Hailer, Frank ;
Nilsson, Maria A. ;
Janke, Axel .
GENOME BIOLOGY AND EVOLUTION, 2015, 7 (07) :2010-2022
[8]   Inferring Species Trees Directly from Biallelic Genetic Markers: Bypassing Gene Trees in a Full Coalescent Analysis [J].
Bryant, David ;
Bouckaert, Remco ;
Felsenstein, Joseph ;
Rosenberg, Noah A. ;
RoyChoudhury, Arindam .
MOLECULAR BIOLOGY AND EVOLUTION, 2012, 29 (08) :1917-1932
[9]   Species Delimitation Using a Combined Coalescent and Information-Theoretic Approach: An Example from North American Myotis Bats [J].
Carstens, Bryan C. ;
Dewey, Tanya A. .
SYSTEMATIC BIOLOGY, 2010, 59 (04) :400-414
[10]  
Castillo-Rami S, 2010, Estimating species trees: practical and theoretical aspects, P15