The applicability of the perturbation based privacy preserving data mining for real-world data

Cited by: 46
Authors
Liu, Li [1]
Kantarcioglu, Murat [1]
Thuraisingham, Bhavani [1]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA
Keywords
data mining; privacy; security
DOI
10.1016/j.datak.2007.06.011
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The perturbation method has been extensively studied for privacy-preserving data mining. In this method, random noise from a known distribution is added to the privacy-sensitive data before the data is sent to the data miner. Subsequently, the data miner reconstructs an approximation to the original data distribution from the perturbed data and uses the reconstructed distribution for data mining purposes. Because noise is added, perturbation-based approaches always trade loss of information against preservation of privacy. The question is, to what extent are the users willing to compromise their privacy? This is a choice that changes from individual to individual. Different individuals may have different attitudes towards privacy based on customs and cultures. Unfortunately, current perturbation-based privacy-preserving data mining techniques do not allow individuals to choose their desired privacy levels. This is a drawback, as privacy is a personal choice. In this paper, we propose an individually adaptable perturbation model, which enables individuals to choose their own privacy levels. The effectiveness of our new approach is demonstrated by various experiments conducted on both synthetic and real-world data sets. Based on our experiments, we suggest a simple yet effective and efficient technique to build data mining models from perturbed data. © 2007 Elsevier B.V. All rights reserved.
Pages: 5-21
Page count: 17
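
The additive-perturbation idea summarized in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the record does not state: the noise is zero-mean Gaussian, each individual picks a noise standard deviation as their "privacy level", and the miner knows those levels. The function names `perturb` and `estimate_moments` are hypothetical. The sketch recovers only the mean and variance of the original values, a deliberately simpler stand-in for the full distribution reconstruction the abstract refers to, and should not be read as the authors' algorithm.

```python
import random
import statistics


def perturb(values, noise_sds, rng):
    """Add zero-mean Gaussian noise to each value.

    noise_sds[i] is the standard deviation chosen by individual i
    (an assumed way of expressing a personal privacy level).
    """
    return [v + rng.gauss(0.0, sd) for v, sd in zip(values, noise_sds)]


def estimate_moments(perturbed, noise_sds):
    """Estimate mean and variance of the original data from perturbed data.

    With independent zero-mean noise, the perturbed mean matches the
    original mean in expectation, and the perturbed variance is roughly
    the original variance plus the average noise variance.
    """
    mean_est = statistics.fmean(perturbed)
    avg_noise_var = statistics.fmean([sd * sd for sd in noise_sds])
    var_est = max(statistics.pvariance(perturbed) - avg_noise_var, 0.0)
    return mean_est, var_est


# Synthetic example: 20,000 individuals with three possible privacy levels.
rng = random.Random(0)
ages = [rng.gauss(40.0, 10.0) for _ in range(20000)]
levels = [rng.choice([2.0, 5.0, 15.0]) for _ in ages]
released = perturb(ages, levels, rng)
mean_est, var_est = estimate_moments(released, levels)
print(f"estimated mean={mean_est:.2f}, estimated variance={var_est:.2f}")
```

Individuals who choose a larger standard deviation contribute noisier values, so the aggregate estimates become less precise as more people opt for stronger privacy. That is the trade-off between information loss and privacy that the abstract describes.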