Hardness of comparing two run-length encoded strings

被引:7
作者
Chen, Kuan-Yu [1 ]
Hsu, Ping-Hui [1 ]
Chao, Kun-Mao [1 ,2 ,3 ]
机构
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan
[2] Natl Taiwan Univ, Grad Inst Biomed Elect & Bioinformat, Taipei 106, Taiwan
[3] Natl Taiwan Univ, Grad Inst Networking & Multimedia, Taipei 106, Taiwan
关键词
Compressed pattern matching; Run-length encoding; Sequence comparison; DISTANCE;
D O I
10.1016/j.jco.2010.03.003
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
In this paper, we consider a commonly used compression scheme called run-length encoding. We provide both lower and upper bounds for the problems of comparing two run-length encoded strings. Specifically, we prove the 3sum-hardness for both the wildcard matching problem and the k-mismatch problem with run-length compressed inputs. Given two run-length encoded strings of m and n runs, such a result implies that it is very unlikely to devise an o(mn)-time algorithm for either of them. We then present an inplace algorithm running in O(mn log m) time for their combined problem, i.e. k-mismatch with wildcards. We further demonstrate that if the aim is to report the positions of all the occurrences, there exists a stronger barrier of Omega (mn log m)time, matching the running time of our algorithm. Moreover, our algorithm can be easily generalized to a two-dimensional setting without impairing the time and space complexity. (C) 2010 Elsevier Inc. All rights reserved.
引用
收藏
页码:364 / 374
页数:11
相关论文
共 20 条
[1]   GENERALIZED STRING MATCHING [J].
ABRAHAMSON, K .
SIAM JOURNAL ON COMPUTING, 1987, 16 (06) :1039-1051
[2]   Faster algorithms for string matching with k mismatches [J].
Amir, A ;
Lewenstein, M ;
Porat, E .
JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2004, 50 (02) :257-275
[3]   Inplace run-length 2d compressed search [J].
Amir, A ;
Landau, GM ;
Sokol, D .
THEORETICAL COMPUTER SCIENCE, 2003, 290 (03) :1361-1383
[4]  
AMIR A, 1992, EFFICIENT 2 DIMENSIO, P279
[5]   Matching for run-length encoded strings [J].
Apostolico, A ;
Landau, GM ;
Skiena, S .
JOURNAL OF COMPLEXITY, 1999, 15 (01) :4-16
[6]   Polygon containment and translational min-Hausdorff-distance between segment sets are 3SUM-hard [J].
Barequet, G ;
Har-Peled, S .
INTERNATIONAL JOURNAL OF COMPUTATIONAL GEOMETRY & APPLICATIONS, 2001, 11 (04) :465-474
[7]   Simple deterministic wildcard matching [J].
Clifford, Peter ;
Clifford, Raphael .
INFORMATION PROCESSING LETTERS, 2007, 101 (02) :53-54
[8]  
CLIFFORD R, 2006, FAST RANDOMISED MAXI, P150
[9]  
CLIFFORD R, 2009, CODING THEORY EFFICI, P778
[10]   A subquadratic sequence alignment algorithm for unrestricted scoring matrices [J].
Crochemore, M ;
Landau, GM ;
Ziv-Ukelson, M .
SIAM JOURNAL ON COMPUTING, 2003, 32 (06) :1654-1673