Bit-parallel computation of local similarity score matrices with unitary weights

被引:3
作者
Hyyro, Heikki [1 ]
Navarro, Gonzalo [1 ]
机构
[1] Univ Tampere, Dept Comp Sci, FIN-33101 Tampere, Finland
关键词
local similarity; approximate string matching; bit-parallelism;
D O I
10.1142/S0129054106004443
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Local similarity computation between two sequences permits detecting all the relevant alignments present between subsequences thereof. A well-known dynamic programming algorithm works in time O(mn), m and n being the lengths of the subsequences. The algorithm is rather slow when applied over many sequence pairs. In this paper we present the first bit-parallel computation of the score matrix, for a simplified choice of scores. If the computer word has w bits, then the resulting algorithm works in O(mn log min(m, n, w)/w) time, achieving up to 8-fold speedups in practice. Some DNA comparison applications use precisely the simplified scores we handle, and thus our algorithm is directly applicable. In others, our method could be used as a raw filter to discard most of the strings, so the classical algorithm can be focused only on the substring pairs that can yield relevant results.
引用
收藏
页码:1325 / 1344
页数:20
相关论文
共 14 条
[1]   A NEW APPROACH TO TEXT SEARCHING [J].
BAEZAYATES, R ;
GONNET, GH .
COMMUNICATIONS OF THE ACM, 1992, 35 (10) :74-82
[2]  
Bergeron A., 2002, International Journal of Foundations of Computer Science, V13, P53, DOI 10.1142/S0129054102000947
[3]  
Crochemore M, 2003, FUND INFORM, V56, P1
[4]  
Crochemore M, 2002, SIAM PROC S, P679
[5]  
GUSFIELD D, 1997, ALGORITHMS STRING ST
[6]   Bit-parallel witnesses and their applications to approximate string matching [J].
Hyyrö, H ;
Navarro, G .
ALGORITHMICA, 2005, 41 (03) :203-231
[7]  
Hyyro H, 2001, Explaining and Extending the Bit-Parallel Approximate String-Matching Algorithm of Myers
[8]   A FASTER ALGORITHM COMPUTING STRING EDIT DISTANCES [J].
MASEK, WJ ;
PATERSON, MS .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1980, 20 (01) :18-31
[9]   A fast bit-vector algorithm for approximate string matching based on dynamic programming [J].
Myers, G .
JOURNAL OF THE ACM, 1999, 46 (03) :395-415
[10]   A guided tour to approximate string matching [J].
Navarro, G .
ACM COMPUTING SURVEYS, 2001, 33 (01) :31-88