Accuracy of phylogeny reconstruction methods combining overlapping gene data sets

Cited by: 41
Authors
Kupczok, Anne [1]
Schmidt, Heiko A. [1]
von Haeseler, Arndt [1]
Affiliations
[1] Univ Vet Med Vienna, Med Univ Vienna, Univ Vienna, Ctr Integrat Bioinformat Vienna, Max F Perutz Labs, A-1030 Vienna, Austria
Funding
Austrian Science Fund
Keywords
MATRIX REPRESENTATION; SUPERTREE CONSTRUCTION; AVERAGE CONSENSUS; ROOTED TREES; MINIMUM-FLIP; EVOLUTION; INCONSISTENCY; SIMULATION; PARSIMONY; SEQUENCES;
DOI
10.1186/1748-7188-5-37
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: The availability of many gene alignments with overlapping taxon sets raises the question of which strategy is the best to infer species phylogenies from multiple gene information. Methods and programs abound that use the gene alignment in different ways to reconstruct the species tree. In particular, different methods combine the original data at different points along the way from the underlying sequences to the final tree. Accordingly, they are classified into superalignment, supertree and medium-level approaches. Here, we present a simulation study to compare different methods from each of these three approaches.
Results: We observe that superalignment methods usually outperform the other approaches over a wide range of parameters including sparse data and gene-specific evolutionary parameters. In the presence of high incongruency among gene trees, however, other combination methods show better performance than the superalignment approach. Surprisingly, some supertree and medium-level methods exhibit, on average, worse results than a single gene phylogeny with complete taxon information.
Conclusions: For some methods, using the reconstructed gene tree as an estimation of the species tree is superior to the combination of incomplete information. Superalignment usually performs best since it is less susceptible to stochastic error. Supertree methods can outperform superalignment in the presence of gene-tree conflict.
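As an illustration of the earliest combination point named in the abstract, the sketch below concatenates gene alignments with overlapping taxon sets into one superalignment, padding taxa that are absent from a gene with a missing-data character. It is a minimal sketch, not the authors' pipeline: the function name `build_superalignment`, the dictionary-based alignment format, the `-` padding character, and the toy sequences are all assumptions made for this example.

```python
def build_superalignment(gene_alignments, missing="-"):
    """Concatenate per-gene alignments into one superalignment.

    Each alignment is a dict mapping taxon name -> aligned sequence.
    Taxa absent from a gene are padded with `missing` (hypothetical
    convention chosen for this sketch).
    """
    # Union of all taxa across genes, in a stable order.
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}

    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene alignment must have equal length")
        (length,) = lengths
        for taxon in taxa:
            # Use the gene's sequence if present, otherwise pad with missing data.
            parts[taxon].append(aln.get(taxon, missing * length))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy data: two genes whose taxon sets overlap only in B and C.
gene1 = {"A": "ACGT", "B": "ACGA", "C": "ACTT"}
gene2 = {"B": "GG", "C": "GA", "D": "TA"}
superalignment = build_superalignment([gene1, gene2])
```

A supertree approach would instead infer one tree per gene alignment and combine the resulting trees; the concatenated matrix above is what a superalignment method would pass to a single tree-reconstruction step.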
Pages: 17