Faster Approximate String Matching for Short Patterns

被引:2
作者
Bille, Philip [1 ]
机构
[1] Tech Univ Denmark, DK-2800 Lyngby, Denmark
关键词
Algorithms; Approximate string matching; Word level parallelism; ALGORITHM; PARALLEL;
D O I
10.1007/s00224-011-9322-y
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We study the classical approximate string matching problem, that is, given strings P and Q and an error threshold k, find all ending positions of substrings of Q whose edit distance to P is at most k. Let P and Q have lengths m and n, respectively. On a standard unit-cost word RAM with word size w >= log n we present an algorithm using time O(nk . min(log(2) m/log n, log(2) m log w/w) + n) When P is short, namely, m = 2(o(root logn)) or m = 2(o(root w/logw) this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.
引用
收藏
页码:492 / 515
页数:24
相关论文
共 32 条
[1]   Improved parallel integer sorting without concurrent writing [J].
Albers, S ;
Hagerup, T .
INFORMATION AND COMPUTATION, 1997, 136 (01) :25-51
[2]   Nearest common ancestors: A survey and a new algorithm for a distributed environment [J].
Alstrup, S ;
Gavoille, C ;
Kaplan, H ;
Rauhe, T .
THEORY OF COMPUTING SYSTEMS, 2004, 37 (03) :441-456
[3]   Sorting in linear time? [J].
Andersson, A ;
Hagerup, T ;
Nilsson, S ;
Raman, R .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1998, 57 (01) :74-93
[4]  
Arlazarov V., 1970, SOV MATH DOKL, V11, P1209
[5]  
Baeza-Yates R.A., 1996, LECT NOTES COMPUTER, V1075, P1
[6]   A NEW APPROACH TO TEXT SEARCHING [J].
BAEZAYATES, R ;
GONNET, GH .
COMMUNICATIONS OF THE ACM, 1992, 35 (10) :74-82
[7]  
Batcher Kenneth E., 1968, P APRIL 30 MAY2 1968, V32, P307, DOI [DOI 10.1145/1468075.1468121, 10.1145/1468075.1468121]
[8]   The LCA problem revisited [J].
Bender, MA ;
Farach-Colton, M .
LATIN 2000: THEORETICAL INFORMATICS, 2000, 1776 :88-94
[9]   Fast and compact regular expression matching [J].
Bille, Philip ;
Farach-Colton, Martin .
THEORETICAL COMPUTER SCIENCE, 2008, 409 (03) :486-496
[10]   Approximate string matching: A simpler faster algorithm [J].
Cole, R ;
Hariharan, R .
SIAM JOURNAL ON COMPUTING, 2002, 31 (06) :1761-1782