A Polynomial-Time Algorithm Computing Lower and Upper Bounds of the Rooted Subtree Prune and Regraft Distance

被引:0
作者
Kannan, Lavanya [1 ]
Li, Hua [1 ]
Mushegian, Arcady [1 ,2 ]
机构
[1] Stowers Inst Med Res, Bioinformat Ctr, Kansas City, MO USA
[2] Univ Kansas, Med Ctr, Dept Microbiol Immunol & Mol Genet, Kansas City, KS 66103 USA
关键词
algorithms; combinatorics; trees; GENE-TRANSFER; EVOLUTION; EVENTS;
D O I
10.1089/cmb.2010.0045
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Rooted, leaf-labeled trees are used in biology to represent hierarchical relationships of various entities, most notably the evolutionary history of molecules and organisms. Rooted Subtree Prune and Regraft (rSPR) operation is a tree rearrangement operation that is used to transform a tree into another tree that has the same set of leaf labels. The minimum number of rSPR operations that transform one tree into another is denoted by d(rSPR) and gives a measure of dissimilarity between the trees, which can be used to compare trees obtained by different approaches, or, in the context of phylogenetic analysis, to detect horizontal gene transfer events by finding incongruences between trees of different evolving characters. The problem of computing the exact d(rSPR) measure is NP-hard, and most algorithms resort to finding sequences of rSPR operations that are sufficient for transforming one tree into another, thereby giving upper bound heuristics for the distance. In this article, we present an O(n(4)) recursive algorithm D-Clust that gives both lower bound and upper bound heuristics for the distance between trees with n shared leaves and also gives a sequence of operations that transforms one tree into another. Our experiments on simulated pairs of trees containing up to 100 leaves showed that the two bounds are almost equal for small distances, thereby giving the nearly-precise actual value, and that the upper bound tends to be close to the upper bounds given by other approaches for all pairs of trees.
引用
收藏
页码:743 / 757
页数:15
相关论文
共 22 条