Association rule hiding based on evolutionary multi-objective optimization

Cited by: 19
Authors
Cheng, Peng [1, 4]
Lee, Ivan [2]
Lin, Chun-Wei [1]
Pan, Jeng-Shyang [1, 3]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Shenzhen, Guangdong, Peoples R China
[2] Univ S Australia, Sch IT & Math Sci, Adelaide, SA 5001, Australia
[3] Fujian Univ Technol, Coll Informat Sci & Engn, Fuzhou, Fujian, Peoples R China
[4] Southwest Univ, Sch Comp & Informat Sci, Chongqing, Peoples R China
Keywords
Privacy preserving data mining; association rule hiding; evolutionary multi-objective optimization; EMO; ALGORITHMS
DOI
10.3233/IDA-160817
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When data mining techniques are applied to discover useful knowledge in a large data collection, they are often required to protect confidential information such as sensitive frequent itemsets and rules. A feasible way to ensure confidentiality is to sanitize the database so that sensitive information is concealed. However, sanitization often produces side effects, so minimizing them is an important task. An important but often overlooked fact is that the side effects trade off against one another: improving performance on one dimension often degrades performance on the others. In this paper, we focus on privacy preservation in association rule mining. Because of this tradeoff, we minimize the side effects from the viewpoint of multi-objective optimization and propose a rule hiding approach based on evolutionary multi-objective optimization (EMO). The approach hides sensitive rules by removing identified items. The side effects, namely missing non-sensitive rules, ghost rules, and data loss, are formulated as optimization objectives. EMO is used to find a suitable subset of transactions to modify so that the side effects are minimized. Experimental results on real datasets show that the proposed approach achieves satisfactory results with fewer side effects. In addition, the EMO-based approach produces multiple hiding solutions in a single run, which lets the user choose a preferred solution according to preference or experience.
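The three objectives named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy database, the thresholds, the victim item `b`, the helper names (`mine_rules`, `sanitize`, `side_effects`, `pareto_front`), and the treatment of "all sensitive rules hidden" as a feasibility filter. For brevity it enumerates every candidate subset of supporting transactions exhaustively rather than running an evolutionary algorithm; it only shows how candidate solutions could be scored and compared by Pareto dominance.

```python
from itertools import combinations

def mine_rules(db, min_sup, min_conf):
    """Brute-force miner: returns rules as (antecedent, consequent) frozenset pairs."""
    n = len(db)
    items = sorted(set().union(*db))
    count = lambda s: sum(1 for t in db if s <= t)
    rules = set()
    for k in range(2, len(items) + 1):
        for combo in combinations(items, k):
            whole = frozenset(combo)
            c = count(whole)
            if c < min_sup * n:          # itemset is not frequent
                continue
            for r in range(1, k):
                for ante in map(frozenset, combinations(combo, r)):
                    if c >= min_conf * count(ante):   # confidence check
                        rules.add((ante, whole - ante))
    return rules

def sanitize(db, chosen, victim):
    """Remove the victim item from the chosen transactions (by index)."""
    return [t - {victim} if i in chosen else t for i, t in enumerate(db)]

def side_effects(db, chosen, victim, sensitive, min_sup, min_conf):
    """Return (all sensitive rules hidden?, (missing rules, ghost rules, data loss))."""
    before = mine_rules(db, min_sup, min_conf)
    clean = sanitize(db, chosen, victim)
    after = mine_rules(clean, min_sup, min_conf)
    missing = len((before - sensitive) - after)   # non-sensitive rules lost
    ghost = len(after - before)                   # rules that newly appear
    data_loss = sum(len(a) - len(b) for a, b in zip(db, clean))  # items removed
    return not (sensitive & after), (missing, ghost, data_loss)

def dominates(a, b):
    """a dominates b if it is no worse on every objective and differs somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(scored):
    return [(c, s) for c, s in scored
            if not any(dominates(o, s) for _, o in scored)]

# Hypothetical toy database; the sensitive rule is {a} -> {b}.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"},
      {"b", "c"}, {"a", "c"}, {"c", "d"}]
sensitive = {(frozenset("a"), frozenset("b"))}
MIN_SUP, MIN_CONF, VICTIM = 0.5, 0.7, "b"

# Candidate solutions: subsets of the transactions that support the rule.
supporting = [i for i, t in enumerate(db) if {"a", "b"} <= t]
scored = []
for r in range(len(supporting) + 1):
    for chosen in map(set, combinations(supporting, r)):
        hidden, objectives = side_effects(db, chosen, VICTIM, sensitive,
                                          MIN_SUP, MIN_CONF)
        if hidden:                       # keep only feasible solutions
            scored.append((chosen, objectives))

front = pareto_front(scored)
for chosen, objectives in front:
    print(sorted(chosen), objectives)
```

Returning the whole non-dominated set rather than a single best candidate mirrors the abstract's point that one run can offer several hiding solutions for the user to choose from.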
Pages: 495-514
Number of pages: 20