A Polynomial-Time Algorithm Computing Lower and Upper Bounds of the Rooted Subtree Prune and Regraft Distance

被引:0
作者
Kannan, Lavanya [1 ]
Li, Hua [1 ]
Mushegian, Arcady [1 ,2 ]
机构
[1] Stowers Inst Med Res, Bioinformat Ctr, Kansas City, MO USA
[2] Univ Kansas, Med Ctr, Dept Microbiol Immunol & Mol Genet, Kansas City, KS 66103 USA
关键词
algorithms; combinatorics; trees; GENE-TRANSFER; EVOLUTION; EVENTS;
D O I
10.1089/cmb.2010.0045
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Rooted, leaf-labeled trees are used in biology to represent hierarchical relationships of various entities, most notably the evolutionary history of molecules and organisms. Rooted Subtree Prune and Regraft (rSPR) operation is a tree rearrangement operation that is used to transform a tree into another tree that has the same set of leaf labels. The minimum number of rSPR operations that transform one tree into another is denoted by d(rSPR) and gives a measure of dissimilarity between the trees, which can be used to compare trees obtained by different approaches, or, in the context of phylogenetic analysis, to detect horizontal gene transfer events by finding incongruences between trees of different evolving characters. The problem of computing the exact d(rSPR) measure is NP-hard, and most algorithms resort to finding sequences of rSPR operations that are sufficient for transforming one tree into another, thereby giving upper bound heuristics for the distance. In this article, we present an O(n(4)) recursive algorithm D-Clust that gives both lower bound and upper bound heuristics for the distance between trees with n shared leaves and also gives a sequence of operations that transforms one tree into another. Our experiments on simulated pairs of trees containing up to 100 leaves showed that the two bounds are almost equal for small distances, thereby giving the nearly-precise actual value, and that the upper bound tends to be close to the upper bounds given by other approaches for all pairs of trees.
引用
收藏
页码:743 / 757
页数:15
相关论文
共 22 条
  • [1] [Anonymous], 2004, Inferring phylogenies
  • [2] Bounding the number of hybridisation events for a consistent evolutionary history
    Baroni, M
    Grünewald, S
    Moulton, V
    Semple, C
    [J]. JOURNAL OF MATHEMATICAL BIOLOGY, 2005, 51 (02) : 171 - 182
  • [3] Phylogenetic identification of lateral genetic transfer events
    Beiko, RG
    Hamilton, N
    [J]. BMC EVOLUTIONARY BIOLOGY, 2006, 6 (1) : 17P
  • [4] Bordewich M., 2005, Ann Comb, V8, P409, DOI [DOI 10.1007/S00026-004-0229-Z, 10.1007/s00026-004-0229-z]
  • [5] Hierarchical structure and the prediction of missing links in networks
    Clauset, Aaron
    Moore, Cristopher
    Newman, M. E. J.
    [J]. NATURE, 2008, 453 (7191) : 98 - 101
  • [6] DIESTEL R, 2005, GRAPH THEORY GRADUAT, V173
  • [7] Evolutionary history of bacteriophages with double-stranded DNA genomes
    Glazko, Galina
    Makarenkov, Vladimir
    Liu, Jing
    Mushegian, Arcady
    [J]. BIOLOGY DIRECT, 2007, 2 (1)
  • [8] Calculating SPR distances between trees
    Goloboff, Pablo A.
    [J]. CLADISTICS, 2008, 24 (04) : 591 - 597
  • [9] Hallett M.T., 2001, P 5 ANN INT C RES CO, P149, DOI [DOI 10.1145/369133.369188, 10.1145/369133.369188]
  • [10] RECONSTRUCTING EVOLUTION OF SEQUENCES SUBJECT TO RECOMBINATION USING PARSIMONY
    HEIN, J
    [J]. MATHEMATICAL BIOSCIENCES, 1990, 98 (02) : 185 - 200