Compressed Indexing with Signature Grammars

被引:8
|
作者
Christiansen, Anders Roy [1 ]
Ettienne, Mikko Berggren [1 ]
机构
[1] Tech Univ Denmark, Lyngby, Denmark
来源
LATIN 2018: THEORETICAL INFORMATICS | 2018年 / 10807卷
关键词
D O I
10.1007/978-3-319-77404-6_25
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The compressed indexing problem is to preprocess a string S of length n into a compressed representation that supports pattern matching queries. That is, given a string P of length m report all occurrences of P in S. We present a data structure that supports pattern matching queries in O(m + occ(lg lg n + lg(epsilon) z)) time using O(z lg(n/z)) space where z is the size of the LZ77 parse of S and epsilon > 0 is an arbitrarily small constant, when the alphabet is small or z = O(n(1-delta)) for any constant delta > 0. We also present two data structures for the general case; one where the space is increased by O(z lg lg z), and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if P occurs in S in O(m) time using O(z lg(n/z)) space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.
引用
收藏
页码:331 / 345
页数:15
相关论文
共 50 条
  • [1] Indexing compressed text
    Ferragina, P
    Manzini, G
    JOURNAL OF THE ACM, 2005, 52 (04) : 552 - 581
  • [2] Compressed domain indexing of losslessly compressed images
    Schaefer, G
    STORAGE AND RETRIEVAL FOR MEDIA DATABASES 2002, 2002, 4676 : 79 - 85
  • [3] Compressed text indexing with wildcards
    Hon, Wing-Kai
    Ku, Tsung-Han
    Shah, Rahul
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    JOURNAL OF DISCRETE ALGORITHMS, 2013, 19 (19) : 23 - 29
  • [4] String Indexing with Compressed Patterns
    Bille, Philip
    Gortz, Inge Li
    Steiner, Teresa Anna
    ACM TRANSACTIONS ON ALGORITHMS, 2023, 19 (04)
  • [5] Compressed Text Indexing with Wildcards
    Hon, Wing-Kai
    Ku, Tsung-Han
    Shah, Rahul
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    STRING PROCESSING AND INFORMATION RETRIEVAL, 2011, 7024 : 267 - +
  • [6] String Indexing with Compressed Patterns
    Bille, Philip
    Gortz, Inge Li
    Steiner, Teresa Anna
    37TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE (STACS 2020), 2020, 154
  • [7] Indexing of compressed video sequences
    Idris, F
    Panchanathan, S
    STORAGE AND RETRIEVAL FOR STILL IMAGE AND VIDEO DATABASES IV, 1996, 2670 : 247 - 253
  • [8] Universal compressed text indexing
    Navarro, Gonzalo
    Prezza, Nicola
    THEORETICAL COMPUTER SCIENCE, 2019, 762 : 41 - 50
  • [9] A Sanitizing Signature Scheme with Indexing
    Koide, Atsushi
    Tso, Raylin
    Okamoto, Eiji
    EUC 2008: PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING, VOL 2, WORKSHOPS, 2008, : 16 - +
  • [10] Video indexing in the wavelet compressed domain
    Panchanathan, S
    Mandal, MK
    Aboulnasr, T
    1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 3, 1998, : 546 - 550