Collecting High-Dimensional and Correlation-Constrained Data with Local Differential Privacy

Cited by: 21
Authors
Du, Rong [1 ]
Ye, Qingqing [1 ]
Fu, Yue [1 ]
Hu, Haibo [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
Source
2021 18TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON SENSING, COMMUNICATION, AND NETWORKING (SECON) | 2021
Funding
National Natural Science Foundation of China
Keywords
ANSWERING RANGE QUERIES; DATA PUBLICATION;
DOI
10.1109/SECON52354.2021.9491591
CLC Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Local differential privacy (LDP) is a promising privacy model for distributed data collection. It has been widely deployed in real-world systems (e.g., Chrome, iOS, macOS). In LDP-based mechanisms, an aggregator collects private values perturbed by each user and then analyses these values to estimate their statistics, such as frequency and mean. Most existing works focus on simple scalar value types, such as boolean and categorical values. However, with the emergence of smart sensors and the Internet of Things, high-dimensional data are gaining increasing popularity. In many cases, correlations exist between various attributes of such data, e.g., temperature and luminance. To ensure LDP for high-dimensional data, existing solutions either partition the privacy budget ε among these correlated attributes or adopt sampling, both of which dilute the density of useful information and thus result in poor data utility. In this paper, we propose a relaxed LDP model, namely, univariate dominance local differential privacy (UDLDP), for high-dimensional data. We quantify the correlations between attributes and present a correlation-bounded perturbation (CBP) mechanism that optimizes the partitioning of privacy budget on each correlated attribute. Furthermore, we extend CBP to support sampling, which is a common bandwidth reduction technique in sensor networks and the Internet of Things. We derive the best allocation strategy of sampling probabilities among attributes in terms of data utility, which leads to the correlation-bounded perturbation mechanism with sampling (CBPS). The performance of both mechanisms is evaluated and compared with state-of-the-art LDP mechanisms on real-world and synthetic datasets.
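The two baseline strategies the abstract criticizes, splitting the budget ε across all d attributes versus sampling a single attribute and spending the full ε on it, can be sketched in a few lines. The code below is a hypothetical illustration (not the paper's CBP/CBPS mechanisms), using the standard Laplace mechanism for per-attribute mean estimation over values in [0, 1]; all function names and parameters are illustrative assumptions.

```python
# Illustrative sketch (NOT the paper's CBP/CBPS): baseline eps-LDP
# collection of d-dimensional values in [0, 1] via the Laplace
# mechanism, comparing budget splitting vs. attribute sampling.
import numpy as np

rng = np.random.default_rng(0)

def collect_split(data, eps):
    """Each user reports all d attributes, each with budget eps/d.
    Laplace scale is sensitivity/(eps/d) = d/eps, so noise grows with d."""
    n, d = data.shape
    reports = data + rng.laplace(scale=d / eps, size=data.shape)
    return reports.mean(axis=0)  # unbiased per-attribute mean estimates

def collect_sample(data, eps):
    """Each user samples one attribute uniformly and spends the full eps
    on it, trading fewer reports per attribute for less noise per report."""
    n, d = data.shape
    picks = rng.integers(0, d, size=n)
    est = np.zeros(d)
    for j in range(d):
        vals = data[picks == j, j]
        est[j] = (vals + rng.laplace(scale=1.0 / eps, size=vals.size)).mean()
    return est

# Synthetic demo: 20,000 users, 3 attributes.
n, d, eps = 20000, 3, 1.0
true_means = np.array([0.2, 0.5, 0.8])
data = np.clip(rng.normal(true_means, 0.1, size=(n, d)), 0.0, 1.0)
print(collect_split(data, eps))   # noisier: Laplace scale is d/eps
print(collect_sample(data, eps))  # sharper per report, but ~n/d reports each
```

Both estimators are unbiased, but each pays for dimensionality: splitting inflates the noise scale by a factor of d, while sampling cuts the per-attribute sample size by roughly d. This is the utility dilution that motivates the paper's correlation-aware budget allocation.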
Pages: 9