Collaborative Sampling for Partial Multi-Dimensional Value Collection Under Local Differential Privacy

Cited by: 3

Authors:

Qian, Qiuyu [1,2]

Ye, Qingqing [1]

Hu, Haibo [1]

Huang, Kai [3]

Chan, Tom Tak-Lam [2]

Li, Jin [4]

Affiliations:

[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China

[2] Ctr Adv Reliabil & Safety CAiRS, Hong Kong, Peoples R China

[3] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Guangzhou Univ, Inst Artificial Intelligence & Blockchain, Guangzhou 510006, Peoples R China

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2023 / Vol. 18

Funding:

National Natural Science Foundation of China;

Keywords:

Local differential privacy; collaborative sampling; privacy-preserving data collection; multi-dimensional data;

DOI:

10.1109/TIFS.2023.3289007

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In the big data era, companies and organizations are keen to collect data from users and analyse their behaviour patterns to make decisions or predictions for profit. However, this undermines users' privacy because the collected data can be highly sensitive and easily leaked. To address these privacy problems, local differential privacy (LDP) has been proposed so that untrusted data collectors can obtain statistical information without compromising user privacy. Most studies on LDP assume that all users fully cooperate and contribute to the data collection process, and thus that the collected dataset is complete. In practice, however, especially when the user population is large, this assumption seldom holds because of communication loss, user unresponsiveness or unwillingness, and incomplete user-side data. Unfortunately, state-of-the-art LDP-based data collection schemes, such as GRR, OUE and OLH, cannot handle partial data collection effectively. In this paper, we propose collaborative sampling to address partial data collection in a multi-dimensional setting. Thanks to a two-phase mechanism, we can derive the optimal sampling rate for each dimension. The optimality is proved with respect to the variance of the estimated frequency. In addition, collaborative sampling is general and can be used with GRR, OUE and OLH with minimal adaptation. Experimental results show that collaborative sampling outperforms existing mainstream data collection schemes in partial multi-dimensional data collection.


Pages: 3948-3961

Page count: 14

References: 49 in total

