Differentially private nearest neighbor classification

被引:15
作者
Gursoy, Mehmet Emre [1 ]
Inan, Ali [2 ]
Nergiz, Mehmet Ercan [3 ]
Saygin, Yucel [4 ]
机构
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
[2] Adana Sci & Technol Univ, Comp Engn Dept, Adana, Turkey
[3] Acadsoft Res, Gaziantep, Turkey
[4] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
关键词
Data mining; Differential privacy; k-Nearest neighbors; ALGORITHMS; FRAMEWORK;
D O I
10.1007/s10618-017-0532-z
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Instance-based learning, and the k-nearest neighbors algorithm (k-NN) in particular, provide simple yet effective classification algorithms for data mining. Classifiers are often executed on sensitive information such as medical or personal data. Differential privacy has recently emerged as the accepted standard for privacy protection in sensitive data. However, straightforward applications of differential privacy to k-NN classification yield rather inaccurate results. Motivated by this, we develop algorithms to increase the accuracy of private instance-based classification. We first describe the radius neighbors classifier (r-N) and show that its accuracy under differential privacy can be greatly improved by a non-trivial sensitivity analysis. Then, for k-NN classification, we build algorithms that convert k-NN classifiers to r-N classifiers. We experimentally evaluate the accuracy of both classifiers using various datasets. Experiments show that our proposed classifiers significantly outperform baseline private classifiers (i.e., straightforward applications of differential privacy) and executing the classifiers on a dataset published using differential privacy. In addition, the accuracy of our proposed k-NN classifiers are at least comparable to, and in many cases better than, the other differentially private machine learning techniques.
引用
收藏
页码:1544 / 1575
页数:32
相关论文
共 56 条
[1]   Deep Learning with Differential Privacy [J].
Abadi, Martin ;
Chu, Andy ;
Goodfellow, Ian ;
McMahan, H. Brendan ;
Mironov, Ilya ;
Talwar, Kunal ;
Zhang, Li .
CCS'16: PROCEEDINGS OF THE 2016 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2016, :308-318
[2]  
Aggarwal CC, 2014, DATA CLASSIFICATION, P157
[3]  
Alcalá-Fdez J, 2011, J MULT-VALUED LOG S, V17, P255
[4]  
Behley J, 2015, IEEE INT CONF ROBOT, P3625, DOI 10.1109/ICRA.2015.7139702
[5]  
Bentley Jon L., 1975, Technical Report
[6]  
Bojarski M, 2014, ARXIV14106973
[7]  
Chaudhuri K., 2009, P ADV NEUR INF PROC, V22, P289
[8]  
Chaudhuri K, 2013, J MACH LEARN RES, V14, P2905
[9]   Differentially Private Spatial Decompositions [J].
Cormode, Graham ;
Procopiuc, Cecilia ;
Srivastava, Divesh ;
Shen, Entong ;
Yu, Ting .
2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, :20-31
[10]   Unsupervised learning with normalised data and non-Euclidean norms [J].
Doherty, K. A. J. ;
Adams, R. G. ;
Davey, N. .
APPLIED SOFT COMPUTING, 2007, 7 (01) :203-210