Differential privacy preserving clustering using Daubechies-2 wavelet transform

被引:1
作者
Dishabi, Mohammad Reza Ebrahimi [1 ]
Azgomi, Mohammad Abdollahi [2 ]
机构
[1] Islamic Azad Univ, Dept Comp Engn, Miyaneh Branch, Miyaneh, Iran
[2] Iran Univ Sci & Technol, Sch Comp Engn, Tehran 1684613114, Iran
关键词
Data mining; privacy preserving clustering (PPC); differential privacy; data dimensionality; Daubechies-2 wavelet transform (D2WT);
D O I
10.1142/S0219691315500289
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Most of the existing privacy preserving clustering (PPC) algorithms do not consider the worst case privacy guarantees and are based on heuristic notions. In addition, these algorithms do not run efficiently in the case of high dimensionality of data. In this paper, to alleviate these challenges, we propose a new PPC algorithm, which is based on Daubechies-2 wavelet transform (D2WT) and preserves the differential privacy notion. Differential privacy is the strong notion of privacy, which provides the worst case privacy guarantees. On the other hand, most of the existing differential-based PPC algorithms generate data with poor utility. If we apply differential privacy properties over the original raw data, the resulting data will offer lower quality of clustering (QOC) during the clustering analysis. Therefore, we use D2WT for the preprocessing of the original data before adding noise to the data. By applying D2WT to the original data, the resulting data not only contains lower dimension compared to the original data, but also can provide differential privacy guarantee with high QOC due to less noise addition. The proposed algorithm has been implemented and experimented over some well-known datasets. We also compare the proposed algorithm with some recently introduced algorithms based on utility and privacy degrees.
引用
收藏
页数:35
相关论文
共 42 条
[1]  
Aggarwal C. C., 2008, SERIES ADV DATABASE, V34
[2]   FRAPP: a framework for high-accuracy privacy-preserving mining [J].
Agrawal, Shipra ;
Haritsa, Jayant R. ;
Prakash, B. Aditya .
DATA MINING AND KNOWLEDGE DISCOVERY, 2009, 18 (01) :101-139
[3]  
[Anonymous], 2013, ARXIV13070475
[4]  
[Anonymous], 2012, ARXIV12042606
[5]  
Bingham E., 2001, KDD-2001. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P245, DOI 10.1145/502512.502546
[6]  
Blum Avrim, 2005, P 24 ACM SIGMOD SIGA, P128, DOI [DOI 10.1145/1065167.1065184, 10.1145/1065167.1065184]
[7]  
Boggess A., 2001, A First Course in Wavelets with Fourier Analysis
[8]  
Chui C., 1992, An introduction to wavelets, V1, DOI [10.2307/2153134, DOI 10.2307/2153134]
[9]  
Clifton C., 2003, P 9 ACM SIGKDD INT C, P206, DOI DOI 10.1145/956755.956776
[10]  
Cui YJ, 2009, LECT NOTES COMPUT SC, V5463, P456