A STRING PATTERN-MATCHING ALGORITHM

被引:0
作者
CHANG, DK [1 ]
机构
[1] CUNY GRAD SCH & UNIV CTR,NEW YORK,NY 10036
关键词
D O I
10.1016/0164-1212(93)90111-A
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
A string pattern-matching algorithm uses a character string, pattern, to search another character string, text, for the first or all occurrence(s) of the pattern in the text. This article presents a string pattern-matching algorithm using a mapping table and an automaton. The number of states of the automaton is equal to the length of the pattern and its number of inputs is one more than the number of different characters in the pattern. During the search, when a mismatch occurs between a pattern character and a text character, the jump to the next position for comparison is made optimal based on the information in the automaton. For non case-sensitive pattern matching, the translation of cases is all done in the mapping table in linear time of the sum of the pattern length and the size of the alphabet; the search part does not do any translation. Experimental results confirm that the ratio of the number of characters inspected over the number of characters passed is better than that of the Boyer-Moore algorithm. The search loop is also very simple and machine independent. Therefore, the algorithm should appeal to anyone who needs a universal search routine that does not depend on statistics of characters.
引用
收藏
页码:207 / 216
页数:10
相关论文
共 7 条
[1]   EFFICIENT STRING MATCHING - AID TO BIBLIOGRAPHIC SEARCH [J].
AHO, AV ;
CORASICK, MJ .
COMMUNICATIONS OF THE ACM, 1975, 18 (06) :333-340
[2]  
AHO AV, 1974, DESIGN ANAL COMPUTER, P318
[3]   FAST STRING SEARCHING ALGORITHM [J].
BOYER, RS ;
MOORE, JS .
COMMUNICATIONS OF THE ACM, 1977, 20 (10) :762-772
[4]  
CHANG DK, 1991, J SYST SOFTWARE, V15, P223
[5]   A NEW PROOF OF THE LINEARITY OF THE BOYER-MOORE STRING SEARCHING ALGORITHM [J].
GUIBAS, LJ ;
ODLYZKO, AM .
SIAM JOURNAL ON COMPUTING, 1980, 9 (04) :672-682
[6]  
Knuth D. E., 1977, SIAM Journal on Computing, V6, P323, DOI 10.1137/0206024
[7]  
[No title captured]