Differentially Private Analysis of Outliers

被引:8
作者
Okada, Rina [1 ]
Fukuchi, Kazuto [1 ]
Sakuma, Jun [1 ]
机构
[1] Univ Tsukuba, Tsukuba, Ibaraki 3058577, Japan
来源
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2015, PT II | 2015年 / 9285卷
关键词
Differential privacy; Outlier detection; Smooth sensitivity; NOISE;
D O I
10.1007/978-3-319-23525-7_28
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents an investigation of differentially private analysis of distance-based outliers. Outlier detection aims to identify instances that are apparently distant from other instances. Meanwhile, the objective of differential privacy is to conceal the presence (or absence) of any particular instance. Outlier detection and privacy protection are therefore intrinsically conflicting tasks. In this paper, we present differentially private queries for counting outliers that appear in a given subspace, instead of reporting the outliers detected. Our analysis of the global sensitivity of outlier counts reveals that regular global sensitivity-based methods can make the outputs too noisy, particularly when the dimensionality of the given subspace is high. Noting that the counts of outliers are typically expected to be small compared to the number of data, we introduce a mechanism based on the smooth upper bound of the local sensitivity. This study is the first trial to ensure differential privacy for distance-based outlier analysis. The experimentally obtained results show that our method achieves better utility than global sensitivity-based methods do.
引用
收藏
页码:458 / 473
页数:16
相关论文
共 22 条
[1]  
Dwork C., 2010, J. Priv. Confid, V1, DOI DOI 10.29012/JPC.V1I2.570
[2]  
Dwork C, 2006, LECT NOTES COMPUT SC, V4004, P486
[3]   Calibrating noise to sensitivity in private data analysis [J].
Dwork, Cynthia ;
McSherry, Frank ;
Nissim, Kobbi ;
Smith, Adam .
THEORY OF CRYPTOGRAPHY, PROCEEDINGS, 2006, 3876 :265-284
[4]   Differentially Private Anomaly Detection with a Case Study on Epidemic Outbreak Detection [J].
Fan, Liyue ;
Xiong, Li .
2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2013, :833-840
[5]   Fast smallest-enclosing-ball computation in high dimensions [J].
Fischer, K ;
Gärtner, B ;
Kutz, M .
ALGORITHMS - ESA 2003, PROCEEDINGS, 2003, 2832 :630-641
[6]   Flexible and Adaptive Subspace Search for Outlier Analysis [J].
Keller, Fabian ;
Mueller, Emmanuel ;
Wixler, Andreas ;
Boehm, Klemens .
PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, :1381-1390
[7]   HiCS: High Contrast Subspaces for Density-Based Outlier Ranking [J].
Keller, Fabian ;
Mueller, Emmanuel ;
Beohm, Klemens .
2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, :1037-1048
[8]  
Knorr E. M., 1998, Proceedings of the Twenty-Fourth International Conference on Very-Large Databases, P392
[9]  
Knorr EM, 1999, PROCEEDINGS OF THE TWENTY-FIFTH INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, P211
[10]  
Kutz M., JAVA LIB COMPUTE MIN