Approximate Pattern Matching with the L1, L2 and L∞ Metrics

被引:0
作者
Ohad Lipsky
Ely Porat
机构
[1] Bar-Ilan University,Department of Computer Science
来源
Algorithmica | 2011年 / 60卷
关键词
Design and analysis of algorithms; Combinatorial algorithms on words; Approximate string matching; Hamming distance;
D O I
暂无
中图分类号
学科分类号
摘要
Given an alphabet Σ={1,2,…,|Σ|} text string T∈Σn and a pattern string P∈Σm, for each i=1,2,…,n−m+1 define Lp(i) as the p-norm distance when the pattern is aligned below the text and starts at position i of the text. The problem of pattern matching with Lp distance is to compute Lp(i) for every i=1,2,…,n−m+1. We discuss the problem for d=1,2,∞. First, in the case of L1 matching (pattern matching with an L1 distance) we show a reduction of the string matching with mismatches problem to the L1 matching problem and we present an algorithm that approximates the L1 matching up to a factor of 1+ε, which has an \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$O(\frac{1}{\varepsilon^{2}}n\log m\log|\Sigma|)$\end{document} run time. Then, the L2 matching problem (pattern matching with an L2 distance) is solved with a simple O(nlog m) time algorithm. Finally, we provide an algorithm that approximates the L∞ matching up to a factor of 1+ε with a run time of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$O(\frac{1}{\varepsilon}n\log m\log|\Sigma|)$\end{document} . We also generalize the problem of String Matching with mismatches to have weighted mismatches and present an O(nlog 4m) algorithm that approximates the results of this problem up to a factor of O(log m) in the case that the weight function is a metric.
引用
收藏
页码:335 / 348
页数:13
相关论文
共 21 条
[1]  
Abrahamson K.(1987)Generalized string matching SIAM J. Comput. 16 1039-1051
[2]  
Amir A.(1995)Efficient 2-dimensional approximate matching of half-rectangular figures Inf. Comput. 118 1-11
[3]  
Farach M.(1994)Alphabet dependence in parameterized matching Inf. Process. Lett. 49 111-115
[4]  
Amir A.(2004)Faster algorithms for string matching with k mismatches J. Algorithms 50 257-275
[5]  
Farach M.(2001)A randomized algorithm for approximate string matching Algorithmica 29 468-486
[6]  
Muthukrishnan S.(1993)Fast algorithms for approximately counting mismatches Inf. Process. Lett. 48 53-60
[7]  
Amir A.(1977)Fast pattern matching in strings SIAM J. Comput. 6 323-350
[8]  
Lewenstein M.(1966)Binary codes capable of correcting, deletions, insertions and reversals Sov. Phys. Dokl. 10 707-710
[9]  
Porat E.(1980)A faster algorithm for computing string-edit distances J. Comput. Syst. Sci. 20 18-31
[10]  
Atallah M.J.(1995)String matching under a general matching relation Inf. Comput. 122 140-148