Collaborative Sampling for Partial Multi-Dimensional Value Collection Under Local Differential Privacy

Cited by: 3
Authors
Qian, Qiuyu [1, 2]
Ye, Qingqing [1 ]
Hu, Haibo [1 ]
Huang, Kai [3 ]
Chan, Tom Tak-Lam [2 ]
Li, Jin [4 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
[2] Ctr Adv Reliabil & Safety CAiRS, Hong Kong, Peoples R China
[3] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[4] Guangzhou Univ, Inst Artificial Intelligence & Blockchain, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Local differential privacy; collaborative sampling; privacy-preserving data collection; multi-dimensional data;
DOI
10.1109/TIFS.2023.3289007
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In the big data era, companies and organizations are keen to collect data from users and analyse their behaviour patterns to make decisions or predictions for profit. However, this practice undermines users' privacy because the collected data can be highly sensitive and easily leaked. To address such privacy problems, local differential privacy (LDP) has been proposed so that untrusted data collectors can obtain statistical information without compromising user privacy. Most studies on LDP assume that all users fully cooperate and contribute to the data collection process, and thus that the collected dataset is complete. In practice, however, especially when the user population is large, this assumption seldom holds due to communication loss, user unresponsiveness or unwillingness, and incomplete user-side data. Unfortunately, state-of-the-art LDP-based data collection schemes, such as GRR, OUE and OLH, cannot handle partial data collection effectively. In this paper, we propose collaborative sampling to address partial data collection in a multi-dimensional setting. Thanks to a two-phase mechanism, we can derive the optimal sampling rate for each dimension, and we prove its optimality with respect to the variance of the estimated frequency. Moreover, collaborative sampling is general and can be used with GRR, OUE and OLH with minimal adaptation. Experimental results show that collaborative sampling outperforms existing mainstream data collection schemes in partial multi-dimensional data collection.
Pages: 3948 - 3961
Page count: 14
Related papers
49 in total
  • [11] From t-closeness to differential privacy and vice versa in data anonymization
    Domingo-Ferrer, Josep
    Soria-Comas, Jordi
    [J]. KNOWLEDGE-BASED SYSTEMS, 2015, 74 : 151 - 158
  • [12] Optimization of cultivation strategy and medium for bacteriocin activity of Enterococcus faecium HDX-2
    Du, Renpeng
    Pei, Fangyi
    Kang, Jie
    Zhang, Wen
    Ping, Wenxiang
    Ling, Hongzhi
    Ge, Jingping
    [J]. PREPARATIVE BIOCHEMISTRY & BIOTECHNOLOGY, 2022, 52 (07) : 762 - 769
  • [13] Utility Analysis and Enhancement of LDP Mechanisms in High-Dimensional Space
    Duan, Jiawei
    Ye, Qingqing
    Hu, Haibo
    [J]. 2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022 : 407 - 419
  • [14] Duchi JC, 2018, J AM STAT ASSOC, V113, P182, DOI 10.1080/01621459.2017.1389735
  • [15] Local Privacy and Statistical Minimax Rates
    Duchi, John C.
    Jordan, Michael I.
    Wainwright, Martin J.
    [J]. 2013 IEEE 54TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2013 : 429 - 438
  • [16] Dwork C., 2006, PROC 33 INT C AUTOMA, P1
  • [17] Differential privacy: A survey of results
    Dwork, Cynthia
    [J]. THEORY AND APPLICATIONS OF MODELS OF COMPUTATION, PROCEEDINGS, 2008, 4978 : 1 - 19
  • [18] Calibrating noise to sensitivity in private data analysis
    Dwork, Cynthia
    McSherry, Frank
    Nissim, Kobbi
    Smith, Adam
    [J]. THEORY OF CRYPTOGRAPHY, PROCEEDINGS, 2006, 3876 : 265 - 284
  • [19] The Algorithmic Foundations of Differential Privacy
    Dwork, Cynthia
    Roth, Aaron
    [J]. FOUNDATIONS AND TRENDS IN THEORETICAL COMPUTER SCIENCE, 2013, 9 (3-4) : 211 - 406
  • [20] RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response
    Erlingsson, Ulfar
    Pihur, Vasyl
    Korolova, Aleksandra
    [J]. CCS'14: PROCEEDINGS OF THE 21ST ACM CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2014 : 1054 - 1067