Fast Arc-Annotated Subsequence Matching in Linear Space

被引:0
作者
Philip Bille
Inge Li Gørtz
机构
[1] Technical University of Denmark,Informatics and Mathematical Modelling
来源
Algorithmica | 2012年 / 62卷
关键词
Subsequence matching; Algorithms; Arc-annotated strings;
D O I
暂无
中图分类号
学科分类号
摘要
An arc-annotated string is a string of characters, called bases, augmented with a set of pairs, called arcs, each connecting two bases. Given arc-annotated strings P and Q the arc-preserving subsequence problem is to determine if P can be obtained from Q by deleting bases from Q. Whenever a base is deleted any arc with an endpoint in that base is also deleted. Arc-annotated strings where the arcs are “nested” are a natural model of RNA molecules that captures both the primary and secondary structure of these. The arc-preserving subsequence problem for nested arc-annotated strings is basic primitive for investigating the function of RNA molecules. Gramm et al. (ACM Trans. Algorithms 2(1): 44–65, 2006) gave an algorithm for this problem using O(nm) time and space, where m and n are the lengths of P and Q, respectively. In this paper we present a new algorithm using O(nm) time and O(n+m) space, thereby matching the previous time bound while significantly reducing the space from a quadratic term to linear. This is essential to process large RNA molecules where the space is likely to be a bottleneck. To obtain our result we introduce several novel ideas which may be of independent interest for related problems on arc-annotated strings.
引用
收藏
页码:209 / 223
页数:14
相关论文
共 36 条
[1]  
Alber J.(2004)Computing the similarity of two sequences with nested arc annotations Theor. Comput. Sci. 312 337-358
[2]  
Gramm J.(1998)More efficient algorithm for ordered tree inclusion J. Algorithms 26 370-385
[3]  
Guo J.(2006)A remark on the subsequence problem for arc-annotated sequences with pairwise nested arcs Inf. Process. Lett. 100 64-68
[4]  
Niedermeier R.(1999)Conserved features of Y RNAs revealed by automated phylogenetic secondary structure analysis Nucl. Acids Res. 27 1070-1078
[5]  
Chen W.(2006)Pattern matching for arc-annotated sequences ACM Trans. Algorithms 2 44-65
[6]  
Damaschke P.(1984)Fast algorithms for finding nearest common ancestors SIAM J. Comput. 13 338-355
[7]  
Farris A.(1996)Protonatable hairpins are conserved in the 5′-untranslated region of tymovirus RNAs Nucl. Acids Res. 24 4910-4917
[8]  
Koelsch G.(2000)Evidence for evolutionarily conserved secondary structure in the H19 tumor suppressor RNA Nucl. Acids Res. 28 1221-1227
[9]  
Pruijn G.(1995)Ordered and unordered tree inclusion SIAM J. Comput. 24 340-356
[10]  
van Venrooij W.(2002)The longest common subsequence problem for sequences with nested arc annotations J. Comput. Syst. Sci. 65 465-480