Phylogenetic Signal and Noise: Predicting the Power of a Data Set to Resolve Phylogeny

Cited by: 104
Authors
Townsend, Jeffrey P. [1, 2]
Su, Zhuo [1]
Tekle, Yonas I. [1, 3]
Affiliations
[1] Yale Univ, Dept Ecol & Evolutionary Biol, New Haven, CT 06520 USA
[2] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[3] Spelman Coll, Dept Biol, Atlanta, GA 30341 USA
Keywords
Experimental design; noise; phylogeny; polytomy; power; resolution; saturation; signal; CHARACTER-STATE SPACE; MULTIGENE ANALYSES; TREE; EVOLUTION; GENE; MITOCHONDRIAL; COMPILATION; INFORMATION; SELECTION; ALIGNMENT
DOI
10.1093/sysbio/sys036
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
A principal objective of phylogenetic experimental design is to predict the power of a data set to resolve nodes in a phylogenetic tree. However, proactively assessing the potential for phylogenetic noise compared with signal in a candidate data set has been a formidable challenge. Understanding how the collection of additional sequence data affects the resolution of recalcitrant internodes at diverse historical times will facilitate increasingly accurate and cost-effective phylogenetic research. Here, we derive theory based on the fundamental unit of the phylogenetic tree, the quartet, that applies estimates of the state space and the rates of evolution of characters in a data set to predict phylogenetic signal and phylogenetic noise, and therefore to predict the power to resolve internodes. We develop and implement a Monte Carlo approach to estimating the power to resolve, and we derive a nearly equivalent, faster deterministic calculation. These approaches are applied to describe the distribution of potential signal, polytomy, or noise for two example data sets, one recent (cytochrome c oxidase I and 28S ribosomal RNA sequences from Diplazontinae parasitoid wasps) and one deep (eight nuclear genes and a phylogenomic sequence for diverse microbial eukaryotes including Stramenopiles, Alveolata, and Rhizaria). The predicted power of resolution for the loci analyzed is consistent with the historic use of the genes in phylogenetics.
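The Monte Carlo idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a symmetric k-state (Jukes-Cantor-like) substitution model, a quartet ((A,B),(C,D)) with equal terminal branch lengths, and a simple rule that compares counts of site patterns supporting the true topology against those supporting either alternative. The function names (`change_prob`, `evolve`, `simulate_quartet`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def change_prob(rate, t, k):
    """Probability that a site ends a branch in a different state
    under a symmetric k-state model (assumed here, not from the paper)."""
    return (k - 1) / k * (1.0 - math.exp(-k / (k - 1) * rate * t))


def evolve(state, p, k, rng):
    """Return the state at the end of a branch with change probability p."""
    if rng.random() < p:
        other = rng.randrange(k - 1)  # pick uniformly among the k-1 other states
        return other if other < state else other + 1
    return state


def simulate_quartet(n_sites, rate, t_internal, t_terminal, k=4, reps=2000, seed=0):
    """Estimate the fractions of replicates showing signal, polytomy, or noise
    for the quartet ((A,B),(C,D)) by direct simulation."""
    rng = random.Random(seed)
    p_int = change_prob(rate, t_internal, k)
    p_term = change_prob(rate, t_terminal, k)
    tally = {"signal": 0, "polytomy": 0, "noise": 0}
    for _ in range(reps):
        true_support = wrong_ac = wrong_ad = 0
        for _ in range(n_sites):
            x = rng.randrange(k)            # state at the ancestor of A and B
            y = evolve(x, p_int, k, rng)    # state at the ancestor of C and D
            a = evolve(x, p_term, k, rng)
            b = evolve(x, p_term, k, rng)
            c = evolve(y, p_term, k, rng)
            d = evolve(y, p_term, k, rng)
            if a == b and c == d and a != c:
                true_support += 1           # pattern supports AB|CD (true tree)
            elif a == c and b == d and a != b:
                wrong_ac += 1               # pattern supports AC|BD
            elif a == d and b == c and a != b:
                wrong_ad += 1               # pattern supports AD|BC
        best_wrong = max(wrong_ac, wrong_ad)
        if true_support > best_wrong:
            tally["signal"] += 1
        elif true_support < best_wrong:
            tally["noise"] += 1
        else:
            tally["polytomy"] += 1
    return {key: count / reps for key, count in tally.items()}


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(simulate_quartet(n_sites=100, rate=0.1, t_internal=1.0,
                           t_terminal=0.5, reps=200, seed=1))
```

In this sketch, shortening `t_internal` or lengthening `t_terminal` should shift probability mass from signal toward polytomy and noise, which is the qualitative trade-off the abstract describes. The deterministic calculation mentioned in the abstract is not reproduced here.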
Pages: 835-849
Page count: 15