Novel Information Theory-Based Measures for Quantifying Incongruence among Phylogenetic Trees

Cited by: 204
Authors
Salichos, Leonidas [1]
Stamatakis, Alexandros [2, 3]
Rokas, Antonis [1, 4]
Affiliations
[1] Vanderbilt Univ, Dept Biol Sci, Nashville, TN 37235 USA
[2] Heidelberg Inst Theoret Studies, Sci Comp Grp, Exelixis Lab, Heidelberg, Germany
[3] Karlsruhe Inst Technol, Inst Theoret Informat, D-76021 Karlsruhe, Germany
[4] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, Nashville, TN USA
Funding
National Science Foundation (USA);
Keywords
internode certainty; bipartition; split; clade support; rare genomic changes; RAxML; DNA-SEQUENCE DATA; SPECIES PHYLOGENIES; GENE TREES; TESTS; SCALE; TOPOLOGIES; CONFIDENCE; EVOLUTION; ORIGIN; TAXA;
DOI
10.1093/molbev/msu061
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010; 081704;
Abstract
Phylogenies inferred from different data matrices often conflict with each other, necessitating the development of measures that quantify this incongruence. Here, we introduce novel measures that use information theory to quantify the degree of conflict or incongruence among all nontrivial bipartitions present in a set of trees. The first measure, internode certainty (IC), calculates the degree of certainty for a given internode (internal branch) by considering the frequency of the bipartition defined by that internode in a given set of trees jointly with that of the most prevalent conflicting bipartition in the same tree set. The second measure, IC All (ICA), calculates the degree of certainty for a given internode by considering the frequency of the bipartition defined by the internode in a given set of trees in conjunction with that of all conflicting bipartitions in the same tree set. Finally, the tree certainty (TC) and TC All (TCA) measures are the sums of the IC and ICA values, respectively, across all internodes of a phylogeny. IC, ICA, TC, and TCA can be calculated from different types of data that contain nontrivial bipartitions, ranging from bootstrap replicate trees to gene trees or individual characters. Given a set of phylogenetic trees, the IC and ICA values of a given internode reflect its specific degree of incongruence, and the TC and TCA values describe the global degree of incongruence among the trees in the set. All four measures are implemented and freely available in version 8.0.0 and subsequent versions of the widely used program RAxML.
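The abstract describes the four measures only in words. As a rough illustration, the sketch below assumes a Shannon-entropy-style formulation in which the frequencies of the internode's bipartition and its conflicting bipartitions are normalized to sum to 1 and the certainty is 1 plus the sum of p * log_n(p) over the n bipartitions considered. That formula, the function names, and the input format are assumptions made for this example and are not taken from the abstract; the sketch also ignores any sign convention or frequency threshold that the RAxML implementation may apply.

```python
import math


def _certainty(support, conflicts):
    """Assumed entropy-based certainty: 1 + sum(p * log_n(p)).

    support   -- number of trees containing the internode's bipartition (> 0)
    conflicts -- counts for the conflicting bipartitions being considered
    """
    if support <= 0:
        raise ValueError("support must be positive in this sketch")
    # Conflicting bipartitions that never occur carry no information.
    conflicts = [c for c in conflicts if c > 0]
    if not conflicts:
        return 1.0  # no observed conflict: full certainty
    counts = [support] + conflicts
    n = len(counts)
    total = sum(counts)
    # Normalize counts to relative frequencies and use log base n,
    # so that n equally frequent bipartitions give a value of 0.
    return 1.0 + sum((c / total) * math.log(c / total, n) for c in counts)


def internode_certainty(support, strongest_conflict):
    """IC: uses only the most prevalent conflicting bipartition."""
    return _certainty(support, [strongest_conflict])


def internode_certainty_all(support, conflicts):
    """ICA: uses all conflicting bipartitions."""
    return _certainty(support, list(conflicts))


def tree_certainty(values):
    """TC or TCA: sum of the IC or ICA values over all internodes."""
    return sum(values)


# Hypothetical counts for three internodes in a set of 100 trees.
ic_values = [
    internode_certainty(100, 0),   # no conflict
    internode_certainty(80, 20),   # moderate conflict
    internode_certainty(50, 50),   # maximal conflict
]
print(ic_values, tree_certainty(ic_values))
```

Under this assumed formulation, values near 1 indicate an internode with little or no conflict, and values near 0 indicate that the competing bipartitions are about equally frequent.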
Pages: 1261-1271
Number of pages: 11