Automated k-Anonymization and l-Diversity for Shared Data Privacy

Cited by: 4
Authors
Kayem, Anne V. D. M. [1, 2]
Vester, C. T. [1]
Meinel, Christoph [2]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, ZA-7701 Cape Town, South Africa
[2] Hasso Plattner Inst, Potsdam, Germany
Source
DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2016, PT I | 2016 / Vol. 9827
Keywords
Automated data anonymization; Multi-objective optimization; k-anonymity; l-diversity; Data outsourcing; ANONYMITY
DOI
10.1007/978-3-319-44403-1_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Analyzing data is a cost-intensive process, particularly for organizations lacking the necessary in-house human and computational capital. Data analytics outsourcing offers a cost-effective solution, but data sensitivity and query response time requirements make data protection a necessary pre-processing step. For performance and privacy reasons, anonymization is preferred over encryption. Yet manual anonymization is time-intensive and error-prone. Automated anonymization is a better alternative but requires satisfying the conflicting objectives of utility and privacy. In this paper, we present an automated anonymization scheme that extends the standard k-anonymization and l-diversity algorithms to satisfy the dual objectives of data utility and privacy. We use a multi-objective optimization scheme that employs a weighting mechanism to minimize information loss and maximize privacy. Our results show that automating l-diversity results in an added average information loss of 7% over automated k-anonymization, but in a diversity of 9-14%, compared with 10-30% in k-anonymized datasets. The lesson that emerges is that automated l-diversity offers better privacy than k-anonymization, with negligible information loss.
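To make the two privacy criteria named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It checks k-anonymity (every combination of quasi-identifier values is shared by at least k records) and the simplest "distinct" form of l-diversity (every such group contains at least l different sensitive values). The function names, the sample records, and the weighted-sum form of `weighted_cost` are assumptions made for illustration; the abstract states only that a weighting mechanism is used, not its exact form.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in groups.values()
    )


def weighted_cost(info_loss, privacy_risk, weight=0.5):
    """Hypothetical weighted-sum objective (assumed form, lower is better)."""
    return weight * info_loss + (1.0 - weight) * privacy_risk


# Hypothetical, already-generalized records used only for illustration.
rows = [
    {"age": "20-29", "zip": "77**", "disease": "flu"},
    {"age": "20-29", "zip": "77**", "disease": "flu"},
    {"age": "30-39", "zip": "78**", "disease": "asthma"},
    {"age": "30-39", "zip": "78**", "disease": "diabetes"},
]
qi = ["age", "zip"]

k_ok = is_k_anonymous(rows, qi, k=2)
l_ok = is_l_diverse(rows, qi, "disease", l=2)
```

In this sample the table is 2-anonymous but not 2-diverse, because the first group contains only one disease value. This is the kind of gap that l-diversity is intended to close beyond k-anonymity.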
Pages: 105-120
Page count: 16