A transversal hypergraph approach for the frequent itemset hiding problem

被引:16
作者
Stavropoulos, Elias C. [1 ,2 ]
Verykios, Vassilios S. [1 ,3 ]
Kagklis, Vasileios [1 ,4 ]
机构
[1] Hellen Open Univ, Educ Content Methodol & Technol Lab, 278 Patron Claus Str, Patras 26335, Greece
[2] Technol Educ Inst Western Greece, Dept Business Adm, Patras 26334, Greece
[3] Hellen Open Univ, Sch Sci & Technol, Patras 26335, Greece
[4] Univ Patras, Comp Engn & Informat Dept, Patras 26504, Greece
关键词
Privacy-preserving data mining; Hiding frequent itemsets; Transversal hypergraph generation; K-ANONYMITY; MONOTONE; DUALIZATION; COMPLEXITY; KNOWLEDGE; SETS;
D O I
10.1007/s10115-015-0862-3
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We propose a methodology for hiding all sensitive frequent itemsets in a transaction database. Our methodology relies on a novel technique that enumerates the minimal transversals of a hypergraph in order to induce the ideal border between frequent and sensitive itemsets. The ideal border is then utilized to formulate an integer linear program (ILP) that answers whether a feasible sanitized database that attains the ideal border, exists. The solution of the program identifies the set of transactions that need to be modified (sanitized) so that the hiding can be achieved with the maximum accuracy. If no solution exists, we modify the ILP by relaxing the constraints needed to be satisfied so that the sanitized database preserves the privacy with guarantee but with minimum effect in data quality. Experimental evaluation of the proposed approach on a number of real datasets has shown that the produced sanitized databases exhibit higher accuracy when compared with the solutions of other well-known approaches.
引用
收藏
页码:625 / 645
页数:21
相关论文
共 53 条
[1]  
AGGARWAL CC, 2008, ADV DATABASE SYSTEMS
[2]  
Agrawal R, 2000, SIGMOD REC, V29, P439, DOI 10.1145/335191.335438
[3]  
Agrawal R., P 20 INT C VERY LARG
[4]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theory of NP-Completeness
[5]  
[Anonymous], IBM ILOG CPLEX US MA
[6]  
[Anonymous], 1999, KDEX WORKSH, DOI [10.1109/KDEX.1999.836532, DOI 10.1109/KDEX.1999.836532]
[7]  
Bailey J, 2003, THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, P485
[8]  
Bayardo R. J. Jr., 1998, SIGMOD Record, V27, P85, DOI 10.1145/276305.276313
[9]  
Berge C., 1989, N HOLLAND MATH LIB, V45
[10]  
Bodon F., 2003, P IEEE ICDM WORKSH F, V90, P56