A randomized algorithm for approximate string matching

被引:28
|
作者
Atallah, MJ [1 ]
Chyzak, F
Dumas, P
机构
[1] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Inst Natl Rech Informat & Automat, F-78153 Le Chesnay, France
关键词
convolution; FFT; approximate string matching; randomized algorithms;
D O I
10.1007/s004530010062
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We give a randomized algorithm in deterministic time O(N log M) for estimating the score vector of matches between a text string of length N and a pattern string of length M, i.e., the vector obtained when the pattern is slid along the text, and the number of matches is counted for each position. A direct application is approximate string matching. The randomized algorithm uses convolution to find an estimator of the scores; the variance of the estimator is particularly small for scores that are close to M, i.e., for approximate occurrences of the pattern in the text. No assumption is made about the probabilistic characteristics of the input, or about the size of the alphabet. The solution extends to string matching with classes, class complements, "never match" and "always match" symbols, to the weighted case and to higher dimensions.
引用
收藏
页码:468 / 486
页数:19
相关论文
共 50 条
  • [21] Faster Approximate String Matching for Short Patterns
    Bille, Philip
    THEORY OF COMPUTING SYSTEMS, 2012, 50 (03) : 492 - 515
  • [22] Intuitionistic Fuzzy Automaton for Approximate String Matching
    Ravi, K. M.
    Choubey, A.
    Tripati, K. K.
    FUZZY INFORMATION AND ENGINEERING, 2014, 6 (01) : 29 - 39
  • [23] APPROXIMATE STRING-MATCHING WITH SUFFIX AUTOMATA
    UKKONEN, E
    WOOD, D
    ALGORITHMICA, 1993, 10 (05) : 353 - 364
  • [24] Approximate string matching with stuck address bits
    Amir, Amihood
    Eisenberg, Estrella
    Keller, Orgad
    Levy, Avivit
    Porat, Ely
    THEORETICAL COMPUTER SCIENCE, 2011, 412 (29) : 3537 - 3544
  • [25] Faster Approximate String Matching for Short Patterns
    Philip Bille
    Theory of Computing Systems, 2012, 50 : 492 - 515
  • [26] A FAST VLSI SOLUTION FOR APPROXIMATE STRING MATCHING
    GROSSI, R
    INTEGRATION-THE VLSI JOURNAL, 1992, 13 (02) : 195 - 206
  • [27] Approximate string matching based on bit operations
    Hanmei, E.
    Yu, Yunqing
    Baba, Kensuke
    Murakami, Kazuaki
    RECENT PROGRESS IN COMPUTATIONAL SCIENCES AND ENGINEERING, VOLS 7A AND 7B, 2006, 7A-B : 195 - 198
  • [28] Optimal implementations of the approximate string matching and the approximate discrete signal matching on the memory machine models
    Nakano, Koji
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2014, 29 (02) : 104 - 118
  • [29] An accurate toponym-matching measure based on approximate string matching
    Kilinc, Deniz
    JOURNAL OF INFORMATION SCIENCE, 2016, 42 (02) : 138 - 149
  • [30] A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches
    Crochemore, Maxime
    Iliopoulos, Costas S.
    Pissis, Solon P.
    ALGORITHMS AND APPLICATIONS: ESSAYS DEDICATED TO ESKO UKKONEN ON THE OCCASION OF HIS 60TH BIRTHDAY, 2010, 6060 : 92 - 101