Reverse-Safe Data Structures for Text Indexing

被引:0
|
作者
Bernardini, Giulia [1 ,2 ]
Chen, Huiping [3 ]
Fici, Gabriele [4 ]
Loukides, Grigorios [3 ]
Pissis, Solon P. [2 ]
机构
[1] Univ Milano Bicocca, Milan, Italy
[2] CWI, Amsterdam, Netherlands
[3] Kings Coll London, London, England
[4] Univ Palermo, Palermo, Italy
来源
2020 PROCEEDINGS OF THE SYMPOSIUM ON ALGORITHM ENGINEERING AND EXPERIMENTS, ALENEX | 2020年
关键词
COMMON SUBSTRING APPROACH; ABELIAN EQUIVALENCE; ANONYMIZATION; CONSTRUCTION; SPACE;
D O I
暂无
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O (n) and answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n! log d) time, where ! is the matrix multiplication exponent. We show that, despite the n! factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method in data analysis applications gives insignificant or no data utility loss. Finally, we show how our technique can be extended to support applications under a realistic adversary model.
引用
收藏
页码:199 / 213
页数:15
相关论文
共 50 条
  • [1] Reverse-Safe Text Indexing
    Bernardini G.
    Chen H.
    Fici G.
    Loukides G.
    Pissis S.P.
    1600, Association for Computing Machinery (26):
  • [2] Reverse-safe authentication protocol for secure USB memories
    Lee, Kyungroul
    Yim, Kangbin
    Spafford, Eugene H.
    SECURITY AND COMMUNICATION NETWORKS, 2012, 5 (08) : 834 - 845
  • [3] Analysis of Indexing Structures for Immutable Data
    Yue, Cong
    Xie, Zhongle
    Zhang, Meihui
    Chen, Gang
    Ooi, Beng Chin
    Wang, Sheng
    Xiao, Xiaokui
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 925 - 935
  • [4] Semantic structures for video data indexing
    Zettsu, K
    Uehara, K
    Tanaka, K
    ADVANCED MULTIMEDIA CONTENT PROCESSING, 1999, 1554 : 356 - 369
  • [5] Suffix Trays and Suffix Trists: Structures for Faster Text Indexing
    Richard Cole
    Tsvi Kopelowitz
    Moshe Lewenstein
    Algorithmica, 2015, 72 : 450 - 466
  • [6] Combining machine learning and hierarchical indexing structures for text categorization
    Ruiz, ME
    Srinivasan, P
    ADVANCES IN CLASSIFICATION RESEARCH, VOL 10, 2001, : 107 - 124
  • [7] Suffix trays and suffix trists: Structures for faster text indexing
    Cole, Richard
    Kopelowitz, Tsvi
    Lewenstein, Moshe
    AUTOMATA, LANGUAGES AND PROGRAMMING, PT 1, 2006, 4051 : 358 - 369
  • [8] Suffix Trays and Suffix Trists: Structures for Faster Text Indexing
    Cole, Richard
    Kopelowitz, Tsvi
    Lewenstein, Moshe
    ALGORITHMICA, 2015, 72 (02) : 450 - 466
  • [9] On effective conceptual indexing and similarity search in text data
    Aggarwal, CC
    Yu, PS
    2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2001, : 3 - 10
  • [10] Indexing data structures for faster volume rendering
    Ihm, I
    Lee, RK
    COMPUTERS & GRAPHICS, 1997, 21 (04) : 497 - 506