Reverse-Safe Text Indexing

被引:0
|
作者
Bernardini G. [1 ]
Chen H. [2 ]
Fici G. [3 ]
Loukides G. [2 ]
Pissis S.P. [4 ]
机构
[1] Bernardini, Giulia
[2] Chen, Huiping
[3] Fici, Gabriele
[4] Loukides, Grigorios
[5] Pissis, Solon P.
来源
| 1600年 / Association for Computing Machinery卷 / 26期
关键词
data privacy; Data structures; pattern matching; suffix tree; text indexing;
D O I
10.1145/3461698
中图分类号
学科分类号
摘要
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matching queries of length at most d optimally, where d is maximal for any such z-RSDS. The construction algorithm takes O(nI• log d) time, where I• is the matrix multiplication exponent. We show that, despite the nI• factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We also show that plugging our method in data analysis applications gives insignificant or no data utility loss. Furthermore, we show how our technique can be extended to support applications under realistic adversary models. Finally, we show a z-RSDS for decision pattern matching queries, whose size can be sublinear in n. A preliminary version of this article appeared in ALENEX 2020. © 2021 Owner/Author.
引用
收藏
相关论文
共 50 条
  • [1] Reverse-Safe Data Structures for Text Indexing
    Bernardini, Giulia
    Chen, Huiping
    Fici, Gabriele
    Loukides, Grigorios
    Pissis, Solon P.
    2020 PROCEEDINGS OF THE SYMPOSIUM ON ALGORITHM ENGINEERING AND EXPERIMENTS, ALENEX, 2020, : 199 - 213
  • [2] Reverse-safe authentication protocol for secure USB memories
    Lee, Kyungroul
    Yim, Kangbin
    Spafford, Eugene H.
    SECURITY AND COMMUNICATION NETWORKS, 2012, 5 (08) : 834 - 845
  • [3] Text indexing with errors
    Maass, Moritz G.
    Nowak, Johannes
    JOURNAL OF DISCRETE ALGORITHMS, 2007, 5 (04) : 662 - 681
  • [4] Text indexing with errors
    Maass, MG
    Nowak, J
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2005, 3537 : 21 - 32
  • [5] SEMANTIC TEXT INDEXING
    Kaleta, Zbigniew
    COMPUTER SCIENCE-AGH, 2014, 15 (01): : 19 - 34
  • [6] Text Mining for Indexing
    Gelernter, Judith
    Lesk, Michael
    JCDL 09: PROCEEDINGS OF THE 2009 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, 2009, : 467 - 467
  • [7] Indexing compressed text
    Ferragina, P
    Manzini, G
    JOURNAL OF THE ACM, 2005, 52 (04) : 552 - 581
  • [8] COMPARISON OF PHILOSOPHY OF INDEXING TEXT WITH THAT OF INDEXING STRUCTURAL FORMULAE
    KENT, AK
    ASLIB PROCEEDINGS, 1967, 19 (11): : 364 - &
  • [9] Succincter Text Indexing with Wildcards
    Thachuk, Chris
    COMBINATORIAL PATTERN MATCHING, 22ND ANNUAL SYMPOSIUM, CPM 2011, 2011, 6661 : 27 - 40
  • [10] The gold text indexing engine
    Barbara, D
    Mehrotra, S
    Vallabhaneni, P
    PROCEEDINGS OF THE TWELFTH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 1996, : 172 - 179