A Retrieval Method of Similar Strings Using Substrings

被引:1
作者
Fuketa, Masao [1 ]
Fujisawa, Nobuo [1 ]
Bando, Hiroaki [1 ]
Morita, Kazuhiro [1 ]
Aoe, Jun-ichi [1 ]
机构
[1] Univ Tokushima, Dept Informat Sci & Intelligent Syst, Tokushima, Japan
来源
2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS: ICCEA 2010, PROCEEDINGS, VOL 2 | 2010年
关键词
similar string; edit distance; substring;
D O I
10.1109/ICCEA.2010.210
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
With the rapid growth of the Internet, demand for retrieving similar strings to an input string has been increasing. Edit distance is helpful to retrieve necessary information from the large amount of data; edit distance is the similarity of two strings. Moreover, an input string must be compared with all strings in a dictionary in order to retrieve similar strings by computing edit distance, and this process is very time-consuming work. This paper proposes a structure of a dictionary for retrieving similar strings effectively and a method to output edit distance of strings in descending order at high speed. This method can retrieve all similar strings and the speed of this method is faster than that of a n-gram method.
引用
收藏
页码:300 / 304
页数:5
相关论文
共 6 条
[1]  
Aoe J., 1991, COMPUTER ALGORITHMS
[2]   Structural optimization of a full-text n-gram index using relational normalization [J].
Kim, Min-Soo ;
Whang, Kyu-Young ;
Lee, Jae-Gil ;
Lee, Min-Jae .
VLDB JOURNAL, 2008, 17 (06) :1485-1507
[3]  
Knuth Donald E., 1998, The Art of Computer Programming, V3
[4]  
Levenshtein V.I., 1966, Soviet Physics Doklady
[5]  
Mayfield J., 2003, Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval, P415, DOI [10.1145/860435.860528, DOI 10.1145/860435.860528]
[6]   An O(ND) Difference Algorithm and Its Variations [J].
Myers, Eugene W. .
ALGORITHMICA, 1986, 1 (1-4) :251-266