On the bottleneck tree alignment problems

被引:2
作者
Chen, Yen Hung [1 ]
Tang, Chuan Yi [2 ]
机构
[1] Taipei Municipal Univ Educ, Dept Comp Sci, Taipei 10048, Taiwan
[2] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 300, Taiwan
关键词
Computational biology; NP-complete; Approximation algorithms; Edit distance; Metric; Ultrametric; Bottleneck tree alignment; Multiple sequence alignment; Phylogenetic tree; MULTIPLE SEQUENCE ALIGNMENT; APPROXIMATION ALGORITHMS;
D O I
10.1016/j.ins.2010.02.008
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Given a set W of k sequences and a tree T with k leaves labeled with a unique sequence in W, a label tree is used to assign a sequence label to each internal node of the tree T. The cost of a tree edge is defined as the distance, such as the Hamming distance or the Levenshtein (edit) distance, between the sequence labels of a pair of nodes in the edge. The bottleneck edge of a label tree is the edge with the maximum cost in the label tree. The bottleneck tree alignment problem is concerned with the determination of a label tree with minimum cost of the bottleneck edge. A lifted label tree is a label tree in which the sequence label of each internal node is equal to the sequence label of some child of the node. The bottleneck lifted tree alignment problem involves the minimization of cost of the bottleneck edge among all possible lifted label trees of the tree T. This paper shows that when the distance function is a metric, the bottleneck tree alignment problem is NP-complete even when the tree structure resembles a binary tree. For special cases, an exact algorithm is used to solve the bottleneck lifted tree alignment problem in polynomial time. A 2(h - 1)-approximation algorithm is used to solve the bottleneck tree alignment problem, where h is the height of the tree T. It is observed that the bound is existentially tight. Finally, this paper shows that any lifted label tree is an optimal solution to the bottleneck tree alignment problem if the distance function is an ultrametric. (C) 2010 Elsevier Inc. All rights reserved.
引用
收藏
页码:2134 / 2141
页数:8
相关论文
共 32 条
[1]  
Aho Alfred V., 1990, Handbook of Theoretical Computer Science, P255, DOI DOI 10.1016/B978-0-444-88071-0.50010-2
[2]   TREES, STARS, AND MULTIPLE BIOLOGICAL SEQUENCE ALIGNMENT [J].
ALTSCHUL, SF ;
LIPMAN, DJ .
SIAM JOURNAL ON APPLIED MATHEMATICS, 1989, 49 (01) :197-209
[3]  
[Anonymous], 1997, Introduction to computational molecular biology
[4]  
[Anonymous], 1997, ACM SIGACT NEWS
[5]   Approximation algorithms for multiple sequence alignment [J].
Bafna, V ;
Lawler, EL ;
Pevzner, PA .
THEORETICAL COMPUTER SCIENCE, 1997, 182 (1-2) :233-244
[6]   Algorithms for phylogenetic footprinting [J].
Blanchette, M ;
Schwikowski, B ;
Tompa, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (02) :211-223
[7]   The complexity of multiple sequence alignment with SP-score that is a metric [J].
Bonizzoni, P ;
Della Vedova, G .
THEORETICAL COMPUTER SCIENCE, 2001, 259 (1-2) :63-79
[8]  
CHAN SC, 1992, B MATH BIOL, V54, P563, DOI 10.1016/S0092-8240(05)80077-1
[9]   A hierarchical model for incomplete alignments in phylogenetic inference [J].
Cheng, Fuxia ;
Hartmann, Stefanie ;
Gupta, Mayetri ;
Ibrahim, Joseph G. ;
Vision, Todd J. .
BIOINFORMATICS, 2009, 25 (05) :592-598
[10]   A new algorithm for computing similarity between RNA structures [J].
Collins, GD ;
Le, SY ;
Zhang, KZ .
INFORMATION SCIENCES, 2001, 139 (1-2) :59-77