Deconvoluting kernel density estimation and regression for locally differentially private data

Cited by: 9
Authors
Farokhi, Farhad [1]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Parkville, Vic 3010, Australia
Keywords
ERRORS-IN-VARIABLES; NONPARAMETRIC REGRESSION; RANDOMIZED-RESPONSE; CHOICE;
DOI
10.1038/s41598-020-78323-0
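Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code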
07 ; 0710 ; 09 ;
Abstract
Local differential privacy has become the gold standard in the privacy literature for gathering or releasing sensitive individual data points in a privacy-preserving manner. However, locally differentially private data can distort the probability density of the data because of the additive noise used to ensure privacy. In fact, the density of the privacy-preserving data (no matter how many samples we gather) is always flatter than the density function of the original data points, because it is the convolution of the original density with the density of the privacy-preserving noise. The effect is especially pronounced when using slowly decaying privacy-preserving noise, such as Laplace noise. This can result in under- or over-estimation of the heavy hitters. This is an important challenge facing social scientists due to the use of differential privacy in the 2020 Census in the United States. In this paper, we develop density estimation methods using smoothing kernels. We use the framework of deconvoluting kernel density estimators to remove the effect of privacy-preserving noise. This approach also allows us to adapt the results from non-parametric regression with errors-in-variables to develop regression models based on locally differentially private data. We demonstrate the performance of the developed methods on financial and demographic datasets.
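The flattening effect and its correction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Gaussian smoothing kernel, Laplace noise with a known scale `b`, synthetic standard-normal data, and a hand-picked bandwidth `h`; all function names are hypothetical. Under these assumptions, dividing the kernel's characteristic function by that of the Laplace noise, 1/(1 + b²t²), gives the closed-form deconvoluting kernel K(u)·[1 + (b/h)²(1 − u²)].

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian smoothing kernel K(u)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def deconvoluting_kernel_laplace(u, b, h):
    """Deconvoluting kernel for Laplace(0, b) noise and a Gaussian K.

    Equals K(u) - (b/h)^2 * K''(u), with K''(u) = (u^2 - 1) K(u).
    """
    return gaussian_kernel(u) * (1.0 + (b / h) ** 2 * (1.0 - u**2))

def naive_kde(grid, z, h):
    """Ordinary KDE of the noisy samples z (estimates the flattened density)."""
    u = (grid[:, None] - z[None, :]) / h
    return gaussian_kernel(u).mean(axis=1) / h

def deconvoluting_kde(grid, z, b, h):
    """Deconvoluting KDE: targets the density of the original data."""
    u = (grid[:, None] - z[None, :]) / h
    return deconvoluting_kernel_laplace(u, b, h).mean(axis=1) / h

# Synthetic illustration (assumed data, not from the paper).
rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(0.0, 1.0, size=n)        # original (private) data
b = 1.0                                  # assumed known Laplace scale
z = x + rng.laplace(0.0, b, size=n)      # privacy-preserving reports

grid = np.linspace(-15.0, 15.0, 301)
dx = grid[1] - grid[0]
h = 0.5                                  # bandwidth chosen by hand

f_naive = naive_kde(grid, z, h)
f_deconv = deconvoluting_kde(grid, z, b, h)

# Riemann-sum check that the deconvoluted estimate integrates to about one.
area_deconv = f_deconv.sum() * dx
```

In this sketch the naive estimate is visibly flatter around the mode than the deconvoluted one. The deconvoluted estimate may dip below zero in the tails, and its variance grows quickly as `h` shrinks relative to `b`, which is why bandwidth selection matters in practice.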
Pages: 11