Key-value data collection and statistical analysis with local differential privacy

Cited by: 1
Authors
Zhu, Hui [1]
Tang, Xiaohu [1]
Yang, Laurence Tianruo [2, 3, 4]
Fu, Chao [5]
Peng, Shuangrong [1]
Affiliations
[1] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu, Peoples R China
[2] Hainan Univ, Sch Comp Sci & Technol, Haikou, Peoples R China
[3] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS, Canada
[4] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[5] Southwest Jiaotong Univ, Sch Math, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Key-value data; Local differential privacy; Mean estimation; Frequency estimation; Range queries
DOI
10.1016/j.ins.2023.119058
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The collection and statistical analysis of simple data types (e.g., categorical, numerical, and multi-dimensional data) under local differential privacy have been widely studied. Recently, researchers have focused on the collection of key-value data, one of the main NoSQL data models. In the collection and statistical analysis of key-value data under local differential privacy, the frequency and mean of each key must be estimated simultaneously. However, achieving a good utility-privacy tradeoff is difficult because key-value data has inherent correlation and users may hold different numbers of key-value pairs. In this paper, we propose an efficient sampling-based scheme for collecting and analyzing key-value data. Note that, under the same perturbation level and perturbation algorithm, the more valid data is collected, the more accurate the resulting statistics. Therefore, we make full use of probability sampling and the inherent correlation of key-value data to increase the probability that users submit valid key-value data. Moreover, we optimize the budget allocation over key-value data so that the overall variance of the frequency and mean estimates is close to optimal. Detailed theoretical analysis and experimental results show that the proposed scheme is more accurate than existing schemes.
Pages: 18
Related papers
50 in total
  • [31] A Comprehensive Survey on Local Differential Privacy toward Data Statistics and Analysis
    Wang, Teng
    Zhang, Xuefeng
    Feng, Jingyu
    Yang, Xinyu
    SENSORS, 2020, 20 (24) : 1 - 48
  • [32] Oblivious Statistic Collection With Local Differential Privacy in Mutual Distrust
    Sasada, Taisho
    Taenaka, Yuzo
    Kadobayashi, Youki
    IEEE ACCESS, 2023, 11 : 21374 - 21386
  • [33] Personalized sampling graph collection with local differential privacy for link prediction
    Jiang, Linyu
    Yan, Yukun
    Tian, Zhihong
    Xiong, Zuobin
    Han, Qilong
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2023, 26 (05) : 2669 - 2689
  • [35] Multidimensional categorical data collection under shuffled differential privacy
    Wang, Ning
    Zhuang, Jian
    Wang, Zhigang
    Wei, Zhiqiang
    Gu, Yu
    Tang, Peng
    Yu, Ge
    COMPUTERS & SECURITY, 2025, 151
  • [36] Collecting and Analyzing Multidimensional Data with Local Differential Privacy
    Wang, Ning
    Xiao, Xiaokui
    Yang, Yin
    Zhao, Jun
    Hui, Siu Cheung
    Shin, Hyejin
    Shin, Junbum
    Yu, Ge
    2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019 : 638 - 649
  • [37] Beyond Value Perturbation: Local Differential Privacy in the Temporal Setting
    Ye, Qingqing
    Hu, Haibo
    Li, Ninghui
    Meng, Xiaofeng
    Zheng, Huadi
    Yan, Haotian
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2021), 2021
  • [38] Privacy-preserving mechanism for mixed data clustering with local differential privacy
    Yuan, Liujie
    Zhang, Shaobo
    Zhu, Gengming
    Alinani, Karim
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (19)
  • [39] LDPGuard: Defenses Against Data Poisoning Attacks to Local Differential Privacy Protocols
    Huang, Kai
    Ouyang, Gaoya
    Ye, Qingqing
    Hu, Haibo
    Zheng, Bolong
    Zhao, Xi
    Zhang, Ruiyuan
    Zhou, Xiaofang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (07) : 3195 - 3209
  • [40] Hierarchical Aggregation for Numerical Data under Local Differential Privacy
    Hao, Mingchao
    Wu, Wanqing
    Wan, Yuan
    SENSORS, 2023, 23 (03)