Efficient systematic clustering method for k-anonymization

Cited by: 42
Authors
Kabir, Md. Enamul [3]
Wang, Hua [3]
Bertino, Elisa [1, 2]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
[3] Univ So Queensland, Dept Math & Comp, Toowoomba, Qld 4350, Australia
Keywords
ANONYMITY; PRIVACY
DOI
10.1007/s00236-010-0131-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a clustering-based k-anonymization technique that minimizes information loss while assuring data quality. (Clustering partitions records into clusters such that records within a cluster are similar to each other, while records in different clusters are as distinct from one another as possible.) Privacy preservation of individuals has drawn considerable interest in data mining research. The k-anonymity model proposed by Samarati and Sweeney is a practical approach to data privacy preservation and has been studied extensively over the last few years. Anonymization methods based on generalization or suppression can protect private information, but they lose valuable information. The challenge is how to minimize this information loss during the anonymization process. We refer to this challenge as the systematic clustering problem for k-anonymization, which is analysed in this paper. The proposed technique groups similar data together and then anonymizes each group individually. The structure of the systematic clustering problem is defined and investigated through its paradigm and properties. An algorithm for the problem is developed, and its time complexity is shown to be O(n^2/k), where n is the total number of records containing privacy-sensitive information about individuals. Experimental results show that our method attains a reasonable advantage with respect to both information loss and execution time. Finally, the algorithm is shown to be usable for incremental datasets.
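The general idea in the abstract (group similar records, then anonymize each group on its own) can be illustrated with a minimal Python sketch. This is not the paper's systematic clustering algorithm, which the abstract does not spell out. It is a simple greedy nearest-neighbour grouping, and the function names `greedy_k_clusters` and `generalize`, the squared Euclidean distance, and the restriction to numeric quasi-identifiers are all assumptions made for illustration.

```python
from typing import List, Tuple

Record = Tuple[float, ...]  # assumption: numeric quasi-identifiers only


def _dist(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records (illustrative choice)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_k_clusters(records: List[Record], k: int) -> List[List[int]]:
    """Group record indices into clusters of at least k similar records.

    Hypothetical greedy scheme, not the method from the paper.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    unassigned = list(range(len(records)))
    clusters: List[List[int]] = []
    while len(unassigned) >= k:
        seed = unassigned.pop(0)
        # Order the remaining records by closeness to the seed.
        unassigned.sort(key=lambda i: _dist(records[seed], records[i]))
        clusters.append([seed] + unassigned[:k - 1])
        unassigned = unassigned[k - 1:]
    # Fewer than k records are left: attach each to the cluster
    # whose seed record is closest.
    for i in unassigned:
        nearest = min(clusters, key=lambda c: _dist(records[c[0]], records[i]))
        nearest.append(i)
    return clusters


def generalize(records: List[Record], clusters: List[List[int]]):
    """Replace every value by the (min, max) range of its cluster."""
    out = [None] * len(records)
    for members in clusters:
        columns = zip(*(records[i] for i in members))
        ranges = tuple((min(col), max(col)) for col in columns)
        for i in members:
            out[i] = ranges
    return out


# Made-up records: (age, income in thousands).
data = [(25, 50), (27, 52), (52, 90), (55, 95), (26, 51), (53, 91)]
groups = greedy_k_clusters(data, k=3)
anonymized = generalize(data, groups)
```

After generalization, every record in a group carries the same ranges, so each record is indistinguishable from at least k - 1 others on these attributes. Tighter groups give narrower ranges, which is the sense in which better clustering reduces information loss.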
Pages: 51-66
Page count: 16