Slicing: A New Approach for Privacy Preserving Data Publishing

Cited by: 146
Authors
Li, Tiancheng [1]
Li, Ninghui [1]
Zhang, Jian [2]
Molloy, Ian [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Keywords
Privacy preservation; data anonymization; data publishing; data security; background knowledge; k-anonymity
DOI
10.1109/TKDE.2010.236
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Several anonymization techniques, such as generalization and bucketization, have been designed for privacy-preserving microdata publishing. Recent work has shown that generalization loses a considerable amount of information, especially for high-dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply to data that lack a clear separation between quasi-identifying attributes and sensitive attributes. In this paper, we present a novel technique called slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing sliced data that obey the l-diversity requirement. Our workload experiments confirm that slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute. Our experiments also demonstrate that slicing can be used to prevent membership disclosure.
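
The abstract describes slicing only at a high level, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes the attribute grouping (the vertical partition) and the bucket size (the horizontal partition) are chosen by hand, and it permutes each column group independently within a bucket to break the linkage between groups. The names `slice_table` and `has_distinct_l_diversity`, and the toy records, are hypothetical. The diversity check is a simplified "distinct values per bucket" test, which is an assumption and not necessarily the l-diversity condition the paper enforces.

```python
import random


def slice_table(rows, column_groups, bucket_size, seed=0):
    """Return a sliced view of `rows` (a list of dicts).

    column_groups: list of attribute-name tuples (vertical partition).
    bucket_size:   number of consecutive rows per bucket (horizontal partition).
    """
    rng = random.Random(seed)
    sliced = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        columns = []
        for group in column_groups:
            # Project the bucket onto this group of attributes.
            values = [tuple(row[attr] for attr in group) for row in bucket]
            # Shuffle within the bucket so values in different column
            # groups can no longer be linked row by row.
            rng.shuffle(values)
            columns.append(values)
        sliced.append(columns)
    return sliced


def has_distinct_l_diversity(sliced, column_index, position, l):
    """Simplified check (an assumption): every bucket holds at least `l`
    distinct values of the sensitive attribute, which sits at `position`
    inside column group `column_index`."""
    return all(
        len({value[position] for value in bucket[column_index]}) >= l
        for bucket in sliced
    )


# Made-up toy records, for illustration only.
records = [
    {"age": 22, "sex": "M", "zip": "47906", "disease": "dyspepsia"},
    {"age": 22, "sex": "F", "zip": "47906", "disease": "flu"},
    {"age": 52, "sex": "F", "zip": "47905", "disease": "bronchitis"},
    {"age": 54, "sex": "M", "zip": "47302", "disease": "gastritis"},
]
groups = [("age", "sex"), ("zip", "disease")]

for bucket in slice_table(records, groups, bucket_size=2):
    print(bucket)
```

Keeping correlated attributes such as `zip` and `disease` in the same column group preserves their joint distribution inside each bucket, while the per-group shuffle hides which `(age, sex)` pair belongs to which `(zip, disease)` pair.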
Pages: 561-574
Number of pages: 14