Anonymization in the time of big data

Cited by: 0
Authors
Domingo-Ferrer J. [1]
Soria-Comas J. [1]
Affiliations
[1] Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, Tarragona, 43007, CA
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2016 / Vol. 9867 LNCS
Keywords
Big data; Curse of dimensionality; Data anonymization; K-anonymity; Multiple releases
DOI
10.1007/978-3-319-45381-1_5
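Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract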
In this work we explore how viable anonymization is for preventing disclosure in structured big data. For the sake of concreteness, we focus on k-anonymity, which is the best-known privacy model based on anonymization. We identify two main challenges in using k-anonymity in big data. First, confidential attributes can also be quasi-identifier attributes, which increases the number of quasi-identifier attributes and may require a large information loss to attain k-anonymity. Second, in big data there is an unlimited number of data controllers, who may publish independent k-anonymous releases on overlapping populations of subjects; the k-anonymity guarantee no longer holds if an observer pools such independent releases. We propose solutions to deal with the above two challenges. Our conclusion is that, with the proposed adjustments, k-anonymity is still useful in a context of big data. © Springer International Publishing Switzerland 2016.
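The second challenge in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper and does not show the authors' proposed solutions; the two releases, the attribute names (`age`, `zip`, `diagnosis`) and the helper functions `smallest_class` and `candidate_values` are all hypothetical. The sketch assumes two controllers publish 2-anonymous releases on the same four subjects using different generalizations, and that an observer knows a target's age (31) and ZIP code (43001).

```python
from collections import Counter

def smallest_class(release, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier
    values; the release is k-anonymous for every k up to this size."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in release)
    return min(groups.values())

def candidate_values(release, known, confidential):
    """Confidential values of the records that match what the observer knows."""
    return {row[confidential] for row in release
            if all(row[q] == v for q, v in known.items())}

# Hypothetical release by controller A: age generalized to decades, ZIP masked.
release_a = [
    {"age": "30-39", "zip": "430**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "430**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "430**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "430**", "diagnosis": "flu"},
]

# Hypothetical release by controller B on the same subjects: age generalized
# more coarsely, ZIP published in full.
release_b = [
    {"age": "30-49", "zip": "43001", "diagnosis": "flu"},
    {"age": "30-49", "zip": "43001", "diagnosis": "diabetes"},
    {"age": "30-49", "zip": "43002", "diagnosis": "asthma"},
    {"age": "30-49", "zip": "43002", "diagnosis": "flu"},
]

qi = ["age", "zip"]
k_a = smallest_class(release_a, qi)  # each release on its own is 2-anonymous
k_b = smallest_class(release_b, qi)

# The target (age 31, ZIP 43001) falls in one class of each release.
in_a = candidate_values(release_a, {"age": "30-39", "zip": "430**"}, "diagnosis")
in_b = candidate_values(release_b, {"age": "30-49", "zip": "43001"}, "diagnosis")

# Pooling the releases: only values present in both classes remain possible.
pooled = in_a & in_b
print(k_a, k_b, sorted(in_a), sorted(in_b), sorted(pooled))
```

In this toy setting each release alone leaves two candidate diagnoses for the target, while intersecting the two classes leaves a single one, which is the kind of failure under pooling that the abstract describes.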
Pages: 57 - 68
Page count: 11