An efficient sanitization algorithm for balancing information privacy and knowledge discovery in association patterns mining

Cited by: 13
Authors
Wang, En Tzu [1]
Lee, Guanling [2]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30043, Taiwan
[2] Natl Dong Hwa Univ, Dept Comp Sci & Informat Engn, Hualien, Taiwan
Keywords
data mining; frequent pattern; sensitive pattern; sanitization process; probability policy;
DOI
10.1016/j.datak.2007.12.005
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovering frequent patterns in large databases is one of the most studied problems in data mining, since it can yield substantial commercial benefits. However, some sensitive patterns with security considerations may compromise privacy. In this paper, we aim to determine an appropriate balance between the need for privacy and the need for information discovery in frequent patterns. We propose a novel method that modifies databases to hide sensitive patterns. Multiplying the original database by a sanitization matrix yields a sanitized database with private content. In addition, two probabilities are introduced to guard against the recovery of sensitive patterns and to reduce the degree to which non-sensitive patterns are hidden in the sanitized database. A complexity analysis and a security discussion of the proposed sanitization process are provided. Results from a series of experiments demonstrating the efficiency and effectiveness of the approach are described. (C) 2008 Elsevier B.V. All rights reserved.
Pages: 463-484
Page count: 22