Multiple pattern matching in LZW compressed text

被引:33
作者
Kida, T [1 ]
Takeda, M [1 ]
Shinohara, A [1 ]
Miyazaki, M [1 ]
Arikawa, S [1 ]
机构
[1] Kyushu Univ 33, Dept Informat, Fukuoka 8128581, Japan
来源
DCC '98 - DATA COMPRESSION CONFERENCE | 1998年
关键词
D O I
10.1109/DCC.1998.672136
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In this paper we address the problem of searching in LZW compressed text directly, and present a new algorithm for finding multiple patterns by simulating the move of the Aho-Corasick pattern matching machine. The new algorithm finds all occurrences of multiple patterns whereas the algorithm proposed by Amir, Benson, and Farach finds only the first occurrence of a single pattern. The new algorithm runs in O(n + m(2) + r) time using O(n + m(2)) space, where n is the length of the compressed text, m is the length of the total length of the patterns, and r is the number of occurrences of the patterns. We implemented a simple version of the algorithm, and showed that it is approximately twice faster than a decompression followed by a search using the Aho-Corasick machine.
引用
收藏
页码:103 / 112
页数:10
相关论文
empty
未找到相关数据