Constructing a Gene Team Tree in Almost O(n lg n) Time

被引:3
|
作者
Wang, Biing-Feng [1 ]
Lin, Chien-Hsin [1 ]
Yang, I-Tse [1 ]
机构
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30013, Taiwan
关键词
Algorithms; data structures; gene teams; comparative genomics; conserved gene clusters; COMMON INTERVALS; CLUSTERS; ALGORITHMS; CONSERVATION; OPERONS; ORDER; SETS;
D O I
10.1109/TCBB.2013.150
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
An important model of a conserved gene cluster is called the gene team model, in which a chromosome is defined to be a permutation of distinct genes and a gene team is defined to be a set of genes that appear in two or more species, with the distance between adjacent genes in the team for each chromosome always no more than a certain threshold delta. A gene team tree is a succinct way to represent all gene teams for every possible value of delta. The previous fastest algorithm for constructing a gene team tree of two chromosomes requires O(n lg n lglg n) time, which was given by Wang and Lin. Its bottleneck is a problem called the maximum-gap problem. In this paper, by presenting an improved algorithm for the maximum-gap problem, we reduce the upper bound of the gene team tree problem to O(n lg n adn). Since a grows extremely slowly, this result is almost as efficient as the current best upper bound, O(n lg n), for finding the gene teams of a fixed d value. Our new algorithm is very efficient from both the theoretical and practical points of view. Wang and Lin's gene-team-tree algorithm can be extended to k chromosomes with complexity O(kn lg n lglg n). Similarly, our improved algorithm for the maximum-gap problem reduces this running time to O(kn lg n adn). In addition, it also provides new upper bounds for the gene team tree problem on general sequences, in which multiple copies of the same gene are allowed.
引用
收藏
页码:142 / 153
页数:12
相关论文
共 50 条