Internal Dictionary Matching

被引:0
|
作者
Panagiotis Charalampopoulos
Tomasz Kociumaka
Manal Mohamed
Jakub Radoszewski
Wojciech Rytter
Tomasz Waleń
机构
[1] The Interdisciplinary Center Herzliya,
[2] University of California,undefined
[3] University of Warsaw,undefined
[4] Samsung R&D,undefined
来源
Algorithmica | 2021年 / 83卷
关键词
Dictionary matching; Internal pattern matching; Range searching; Dynamic dictionary;
D O I
暂无
中图分类号
学科分类号
摘要
We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary D\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {D}$$\end{document} in fragments of a given string T of length n. The dictionary is internal in the sense that each pattern in D\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {D}$$\end{document} is given as a fragment of T. This way, D\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {D}$$\end{document} takes space proportional to the number of patterns d=|D|\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d=|\mathsf {D}|$$\end{document} rather than their total length, which could be Θ(n·d)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\varTheta (n\cdot d)$$\end{document}. In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from D\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {D}$$\end{document} in a fragment T[i..j]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T[i \mathinner {.\,.}j]$$\end{document} and reporting distinct patterns from D\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {D}$$\end{document} that occur in T[i..j]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T[i \mathinner {.\,.}j]$$\end{document}. We show how to construct, in O((n+d)logO(1)n)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O((n+d) \log ^{O(1)} n)$$\end{document} time, a data structure that answers each of these queries in time O(logO(1)n+|output|)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O(\log ^{O(1)} n+| output |)$$\end{document}. The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight—up to subpolynomial factors—upper and lower bounds for the case of a dynamic dictionary.
引用
收藏
页码:2142 / 2169
页数:27
相关论文
共 50 条
  • [1] Internal Dictionary Matching
    Charalampopoulos, Panagiotis
    Kociumaka, Tomasz
    Mohamed, Manal
    Radoszewski, Jakub
    Rytter, Wojciech
    Walen, Tomasz
    ALGORITHMICA, 2021, 83 (07) : 2142 - 2169
  • [2] DYNAMIC DICTIONARY MATCHING
    AMIR, A
    FARACH, M
    GALIL, Z
    GIANCARLO, R
    PARK, K
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1994, 49 (02) : 208 - 222
  • [3] Dictionary Matching in a Stream
    Clifford, Raphael
    Fontaine, Allyx
    Porat, Ely
    Sach, Benjamin
    Starikovskaya, Tatiana
    ALGORITHMS - ESA 2015, 2015, 9294 : 361 - 372
  • [4] Dictionary matching with a few gaps
    Amir, Amihood
    Levy, Avivit
    Porat, Ely
    Shalom, B. Riva
    THEORETICAL COMPUTER SCIENCE, 2015, 589 : 34 - 46
  • [5] Compressed automata for dictionary matching
    I, Tomohiro
    Nishimoto, Takaaki
    Inenaga, Shunsuke
    Bannai, Hideo
    Takeda, Masayuki
    THEORETICAL COMPUTER SCIENCE, 2015, 578 : 30 - 41
  • [6] Faster compressed dictionary matching
    Hon, Wing-Kai
    Ku, Tsung-Han
    Shah, Rahul
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    THEORETICAL COMPUTER SCIENCE, 2013, 475 : 113 - 119
  • [7] IMPROVED DYNAMIC DICTIONARY MATCHING
    AMIR, A
    FARACH, M
    IDURY, RM
    LAPOUTRE, JA
    SCHAFFER, AA
    INFORMATION AND COMPUTATION, 1995, 119 (02) : 258 - 282
  • [8] Compressed index for dictionary matching
    Hon, Wing-Kai
    Lam, Tak-Wah
    Shah, Rahul
    Tam, Siu-Lung
    Vitter, Jeffrey Scott
    DCC: 2008 DATA COMPRESSION CONFERENCE, PROCEEDINGS, 2008, : 23 - +
  • [9] Succinct Dictionary Matching with No Slowdown
    Belazzougui, Djamal
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2010, 6129 : 88 - 100
  • [10] Matching pursuits with undercomplete dictionary
    Wang, Dang-wei
    Ma, Xiao-yan
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 2255 - 2259