Pruning Rogue Taxa Improves Phylogenetic Accuracy: An Efficient Algorithm and Webservice

Cited by: 290
Authors
Aberer, Andre J. [1]
Krompass, Denis [1]
Stamatakis, Alexandros [1]
Affiliations
[1] Heidelberg Inst Theoret Studies HITS gGmbH, Sci Comp Grp, Exelixis Lab, D-69118 Heidelberg, Germany
Keywords
Bootstrap support; consensus tree; phylogenetic postanalysis; rogue taxa; software; webservice; INFERENCE; CONSENSUS; TREE
DOI
10.1093/sysbio/sys078
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees.
Pages: 162-166
Page count: 5