An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques

Cited by: 43
Authors
Eyupoglu, Can [1]
Aydin, Muhammed Ali [2]
Zaim, Abdul Halim [1]
Sertbas, Ahmet [2]
Affiliations
[1] Istanbul Commerce Univ, Dept Comp Engn, TR-34840 Istanbul, Turkey
[2] Istanbul Univ, Dept Comp Engn, TR-34320 Istanbul, Turkey
Keywords
big data; chaos; data anonymization; data perturbation; privacy preserving; differential privacy; k-anonymity; security; internet; framework
DOI
10.3390/e20050373
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
The topic of big data has attracted increasing interest in recent years. The emergence of big data creates new difficulties for the protection models used for data privacy, which is necessary for sharing and processing data. Protecting individuals' sensitive information while maintaining the usability of the published data set is the most important challenge in privacy preservation. To this end, data anonymization methods are used to protect data against identity disclosure and linking attacks. In this study, a novel data anonymization algorithm based on chaos and perturbation is proposed for preserving privacy and utility in big data. The performance of the proposed algorithm is evaluated in terms of Kullback-Leibler divergence, probabilistic anonymity, classification accuracy, F-measure and execution time. The experimental results show that the proposed algorithm is efficient and performs better in terms of Kullback-Leibler divergence, classification accuracy and F-measure than most of the existing algorithms evaluated on the same data set. The resulting algorithm, which applies chaos to perturb data, is promising for use in privacy-preserving data mining and data publishing.
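The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: perturbing a numeric attribute with a chaotic sequence and then measuring the distortion with Kullback-Leibler divergence. The logistic map, the parameter `r = 3.99`, the seed `x0`, the noise `scale`, the binning, and all function names are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def logistic_map(x0, n, r=3.99):
    """Return n iterates of the logistic map x -> r * x * (1 - x).

    Assumption: the logistic map stands in for "chaos"; the paper's
    actual chaotic function is not stated in the abstract.
    """
    x = x0
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(x)
    return out


def perturb(values, x0=0.37, scale=2.0):
    """Add chaos-driven noise in [-scale, scale] to each value (hypothetical scheme)."""
    chaos = logistic_map(x0, len(values))
    return [v + (2.0 * c - 1.0) * scale for v, c in zip(values, chaos)]


def histogram(values, bins, lo, hi):
    """Relative frequencies over equal-width bins, with add-one smoothing."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(values) + bins
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy data: a hypothetical "age" column.
ages = [23, 35, 35, 41, 52, 29, 47, 61, 38, 35]
noisy = perturb(ages)
p = histogram(ages, bins=5, lo=20, hi=70)
q = histogram(noisy, bins=5, lo=20, hi=70)
print(kl_divergence(p, q))
```

A lower divergence suggests that the perturbed column stays closer to the original distribution. Smoothing the histograms keeps every bin probability positive, so the logarithm is always defined.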
Pages: 18