Improved Differentially Private Euclidean Distance Approximation

Cited by: 6
Authors
Stausholm, Nina Mesing [1,2]
Affiliations
[1] IT Univ Copenhagen, Copenhagen, Denmark
[2] BARC, Copenhagen, Denmark
Source
PODS '21: PROCEEDINGS OF THE 40TH SIGMOD-SIGACT-SIGAI SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS | 2021
Keywords
differential privacy; dimensionality reduction; Johnson-Lindenstrauss; noise
DOI
10.1145/3452021.3458328
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
This work shows how to privately and more accurately estimate the Euclidean distance between pairs of vectors. Input vectors x and y are mapped to differentially private sketches x' and y', from which one can estimate the distance between x and y. Our estimator relies on the Sparser Johnson-Lindenstrauss constructions of Kane & Nelson (Journal of the ACM 2014), which for any 0 < α, β < 1/2 have optimal output dimension k = Θ(α^(-2) log(1/β)) and sparsity s = O(α^(-1) log(1/β)). We combine the constructions of Kane & Nelson with either the Laplace or the Gaussian mechanism from the differential privacy literature, depending on the privacy parameters ε and δ. We also suggest a differentially private version of the Fast Johnson-Lindenstrauss Transform (FJLT) of Ailon & Chazelle (SIAM Journal on Computing 2009), which trades speed for variance for certain parameters. We answer an open question of Kenthapadi et al. (Journal of Privacy and Confidentiality 2013) by analyzing the privacy and utility guarantees of an estimator for Euclidean distance that relies on Laplace rather than Gaussian noise. We prove that the Laplace mechanism yields lower variance than the Gaussian mechanism whenever δ < β^(O(1/α)). Thus, our work improves on that of Kenthapadi et al. by giving a more efficient estimator with lower variance for sufficiently small δ. Our sketch also achieves pure differential privacy as a neat side effect of the Laplace mechanism, rather than the approximate differential privacy guarantee of the Gaussian mechanism, which may not be sufficiently strong for some settings. Our main result is a special case of more general, technical results proving that one can construct unbiased estimators for Euclidean distance with a high level of utility even under the constraint of differential privacy.
The bulk of our analysis is proving that the variance of the estimator does not suffer too much in the presence of differential privacy.
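The pipeline the abstract describes, a Johnson-Lindenstrauss projection followed by calibrated noise and a debiased distance estimate, can be illustrated with a simplified sketch. The snippet below uses a dense Gaussian JL matrix and the Laplace mechanism with an assumed sensitivity of 1; the paper's actual construction uses Kane & Nelson's Sparser JL transforms and a different noise calibration, so treat this purely as a schematic of the unbiased-estimator idea, not the paper's method.

```python
# Illustrative only: dense Gaussian JL matrix + Laplace mechanism with an
# assumed sensitivity. The paper uses Sparser JL transforms and calibrates
# noise differently; all parameter choices here are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

def private_sketch(x, A, epsilon, sensitivity=1.0):
    """Noisy JL sketch of x: epsilon-DP under the assumed l1-sensitivity
    of the projected vector. Returns the sketch and the Laplace scale."""
    k = A.shape[0]
    proj = (A @ x) / np.sqrt(k)       # E[||proj||^2] = ||x||^2 for Gaussian A
    b = sensitivity / epsilon         # Laplace scale from the privacy budget
    return proj + rng.laplace(scale=b, size=k), b

def estimate_sq_distance(sx, sy, b, k):
    """Unbiased estimate of ||x - y||^2: subtract the expected noise
    contribution 4*k*b^2 (two independent Laplace(b) noises per coordinate,
    each with variance 2*b^2)."""
    return float(np.sum((sx - sy) ** 2) - 4 * k * b ** 2)

d, k = 200, 1024
A = rng.standard_normal((k, d))       # JL matrix shared by both parties
x, y = rng.standard_normal(d), rng.standard_normal(d)

sx, b = private_sketch(x, A, epsilon=1.0)
sy, _ = private_sketch(y, A, epsilon=1.0)
est = estimate_sq_distance(sx, sy, b, k)   # noisy but unbiased
```

Swapping the Laplace noise for Gaussian noise (calibrated to l2-sensitivity) gives the approximate-DP variant; the abstract's point is that for sufficiently small δ the Laplace version has lower variance while also giving pure ε-DP.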
Pages: 42-56
Page count: 15
References
48 in total
[1]   Database-friendly random projections: Johnson-Lindenstrauss with binary coins [J].
Achlioptas, D.
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :671-687
[2]   THE FAST JOHNSON-LINDENSTRAUSS TRANSFORM AND APPROXIMATE NEAREST NEIGHBORS [J].
Ailon, Nir ;
Chazelle, Bernard .
SIAM JOURNAL ON COMPUTING, 2009, 39 (01) :302-322
[3]  
[Anonymous], 1984, Conference in Modern Analysis and Probability
[4]  
Barbaro, Michael; Zeller, Tom, Jr., 2006, A Face Is Exposed for AOL Searcher No. 4417749
[5]  
Bhaskar R, 2011, LECT NOTES COMPUT SC, V7073, P215, DOI 10.1007/978-3-642-25385-0_12
[6]   The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy [J].
Blocki, Jeremiah ;
Blum, Avrim ;
Datta, Anupam ;
Sheffet, Or .
2012 IEEE 53RD ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2012, :410-419
[7]   Randomized Dimensionality Reduction for k-Means Clustering [J].
Boutsidis, Christos ;
Zouzias, Anastasios ;
Mahoney, Michael W. ;
Drineas, Petros .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2015, 61 (02) :1045-1062
[8]  
Bringmann K, 2014, LECT NOTES COMPUT SC, V8572, P247
[9]  
Canonne CL, 2024, arXiv, DOI arXiv:2004.00010
[10]   Differentially Private High-Dimensional Data Publication via Sampling-Based Inference [J].
Chen, Rui ;
Xiao, Qian ;
Zhang, Yu ;
Xu, Jianliang .
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, :129-138