Suffix array for multi-pattern matching with variable length wildcards

被引:2
|
作者
Liu, Na [1 ,2 ,3 ]
Xie, Fei [4 ]
Wu, Xindong [1 ,2 ,5 ]
机构
[1] Hefei Univ Technol, Minist Educ, Key Lab Knowledge Engn Big Data, Hefei, Anhui, Peoples R China
[2] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Anhui, Peoples R China
[3] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan, Ningxia, Peoples R China
[4] Hefei Normal Univ, Dept Comp Sci & Technol, Hefei 230009, Anhui, Peoples R China
[5] Mininglamp Technol, Mininglamp Acad Sci, Beijing, Peoples R China
基金
中国国家自然科学基金;
关键词
Pattern matching; suffix array; wildcards; ALGORITHM; CONSTRUCTION;
D O I
10.3233/IDA-205087
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Approximate multi-pattern matching is an important issue that is widely and frequently utilized, when the pattern contains variable-length wildcards. In this paper, two suffix array-based algorithms have been proposed to solve this problem. Suffix array is an efficient data structure for exact string matching in existing studies, as well as for approximate pattern matching and multi-pattern matching. An algorithm called MMSA-S is for the short exact characters in a pattern by dynamic programming, while another algorithm called MMSA-L deals with the long exact characters by the edit distance method. Experimental results of Pizza & Chili corpus demonstrate that these two newly proposed algorithms, in most cases, are more time-efficient than the state-of-the-art comparison algorithms.
引用
收藏
页码:283 / 303
页数:21
相关论文
共 50 条
  • [1] Multi-pattern matching with variable-length wildcards using suffix tree
    Liu, Na
    Xie, Fei
    Wu, Xindong
    PATTERN ANALYSIS AND APPLICATIONS, 2018, 21 (04) : 1151 - 1165
  • [2] Multi-pattern matching with variable-length wildcards using suffix tree
    Na Liu
    Fei Xie
    Xindong Wu
    Pattern Analysis and Applications, 2018, 21 : 1151 - 1165
  • [3] Multi-pattern matching with wildcards
    Zhang M.
    Zhang Y.
    Tang J.
    Bai X.
    Journal of Software, 2011, 6 (12 SPEC. ISSUE) : 2391 - 2398
  • [4] A novel optimal multi-pattern matching method with wildcards for DNA sequence
    Wang, Xinlu
    Saif, Ahmed A. F.
    Liu, Dayou
    Zhu, Yungang
    Benediktsson, Jon Atli
    TECHNOLOGY AND HEALTH CARE, 2021, 29 : S115 - S124
  • [5] Pattern Matching with Wildcards based on Multiple Suffix Trees
    Liu, Yingling
    Wu, Xindong
    Hu, Xuegang
    Gao, Jun
    Wang, Chi
    2012 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2012), 2012, : 320 - 325
  • [6] Pattern matching with wildcards using words of shorter length
    Zhang, Meng
    Zhang, Yi
    Hu, Liang
    INFORMATION PROCESSING LETTERS, 2010, 110 (24) : 1099 - 1102
  • [7] Pattern Matching with Flexible Wildcards
    Wu, Xindong
    Qiang, Ji-Peng
    Xie, Fei
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 29 (05) : 740 - 750
  • [8] Indexed Multi-pattern Matching
    Gagie, Travis
    Karhu, Kalle
    Karkkainen, Juha
    Makinen, Veli
    Salmela, Leena
    Tarhio, Jorma
    LATIN 2012: THEORETICAL INFORMATICS, 2012, 7256 : 399 - 407
  • [9] Pattern Matching with Flexible Wildcards
    Xindong Wu
    Ji-Peng Qiang
    Fei Xie
    Journal of Computer Science and Technology, 2014, 29 : 740 - 750
  • [10] Pattern Matching with Flexible Wildcards
    吴信东
    强继朋
    谢飞
    JournalofComputerScience&Technology, 2014, 29 (05) : 740 - 751