A New Method for Handling Missing Species in Diversification Analysis Applicable to Randomly or Nonrandomly Sampled Phylogenies

Cited by: 52
Authors
Cusimano, Natalie [1]
Stadler, Tanja [2]
Renner, Susanne S. [1]
Affiliations
[1] Univ Munich LMU, Fac Biol, D-80638 Munich, Germany
[2] ETH, Inst Integrat Biol, Dept Environm Syst Sci, CH-8092 Zurich, Switzerland
Keywords
Birth-death likelihood analysis; diversification rates; missing-species-problem; model fitting; nonrandom species sampling; gamma statistic; BIRTH-DEATH-MODELS; RATES; ARACEAE; SHIFTS; TREES; INFERENCES; SPECIATION; TEMPO;
DOI
10.1093/sysbio/sys031
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Chronograms from molecular dating are increasingly being used to infer rates of diversification and their change over time. A major limitation in such analyses is incomplete species sampling, which moreover is usually nonrandom. While the widely used gamma statistic with the Monte Carlo constant-rates test or the birth-death likelihood analysis with the ΔAICrc test statistic are appropriate for comparing the fit of different diversification models in phylogenies with random species sampling, no objective automated method has been developed for fitting diversification models to nonrandomly sampled phylogenies. Here, we introduce a novel approach, CorSiM, which involves simulating missing splits under a constant-rate birth-death model and allows the user to specify whether species sampling in the phylogeny being analyzed is random or nonrandom. The completed trees can be used in subsequent model-fitting analyses. This is fundamentally different from previous diversification rate estimation methods, which were based on null distributions derived from the incomplete trees. CorSiM is automated in an R package and can easily be applied to large data sets. We illustrate the approach in two Araceae clades, one with a random species sampling of 52% and one with a nonrandom sampling of 55%. In the latter clade, the CorSiM approach detects and quantifies an increase in diversification rate, whereas classic approaches prefer a constant-rate model; in the former clade, results do not differ among methods (as indeed expected since the classic approaches are valid only for randomly sampled phylogenies). The CorSiM method greatly reduces the type I error in diversification analysis, but type II error remains a methodological problem.
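As a rough illustration of the gamma statistic named in the abstract, the sketch below computes it from the ages of a tree's internal nodes, following the standard definition of Pybus and Harvey (2000). It is not part of CorSiM or of the authors' R package; the function name `gamma_statistic` and the input format (a plain list of node ages, root included) are assumptions made for this example.

```python
import math


def gamma_statistic(node_ages):
    """Gamma statistic of an ultrametric, fully bifurcating tree.

    node_ages: ages (time before present) of all n-1 internal nodes,
    root included, for a tree with n tips. Hypothetical helper for
    illustration only.
    """
    ages = sorted(node_ages, reverse=True)
    n = len(ages) + 1  # number of tips
    if n < 3:
        raise ValueError("gamma is defined only for trees with at least 3 tips")

    # g_k is the time during which exactly k lineages exist (k = 2..n);
    # the last interval runs from the youngest node to the present (age 0).
    bounds = ages + [0.0]
    weighted = [k * (bounds[k - 2] - bounds[k - 1]) for k in range(2, n + 1)]
    total = sum(weighted)  # T = sum_{j=2}^{n} j * g_j

    # Sum over i = 2..n-1 of the partial sums sum_{k=2}^{i} k * g_k.
    running = 0.0
    partial_sum_total = 0.0
    for w in weighted[:-1]:
        running += w
        partial_sum_total += running

    numerator = partial_sum_total / (n - 2) - total / 2.0
    denominator = total * math.sqrt(1.0 / (12.0 * (n - 2)))
    return numerator / denominator


# Splits concentrated near the root give a negative value,
# splits concentrated near the present give a positive one.
early = gamma_statistic([10, 9, 8, 7])
late = gamma_statistic([10, 3, 2, 1])
```

Because species missing from a phylogeny remove splits from the tree, and removing recent splits pushes the statistic toward negative values, the way the missing splits are accounted for matters for any test built on it.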
Pages: 785-792
Page count: 8