Managing dimensionality in data privacy anonymization

Cited by: 16
Authors
Zakerzadeh, Hessam [1]
Aggarwal, Charu C. [2]
Barker, Ken [1]
Affiliations
[1] Univ Calgary, Calgary, AB, Canada
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
Keywords
High-dimensional anonymization; Privacy; k-Anonymity; l-Diversity; Vertical fragmentation
DOI
10.1007/s10115-015-0906-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The curse of dimensionality has remained a challenge for a wide variety of algorithms in data mining, clustering, classification, and privacy. Recently, it was shown that increasing dimensionality makes the data resistant to effective privacy. The theoretical results seem to suggest that the dimensionality curse is a fundamental barrier to privacy preservation. However, in practice, we show that some of the common properties of real data can be leveraged in order to greatly ameliorate the negative effects of the curse of dimensionality. In real data sets, many dimensions contain high levels of inter-attribute correlations. Such correlations enable the use of a process known as vertical fragmentation in order to decompose the data into vertical subsets of smaller dimensionality. An information-theoretic criterion of mutual information is used in the vertical decomposition process. This allows the use of an anonymization process, which is based on combining results from multiple independent fragments. We present a general approach, which can be applied to the k-anonymity, l-diversity, and t-closeness models. In the presence of inter-attribute correlations, such an approach continues to be much more robust in higher dimensionality, without losing accuracy. We present experimental results illustrating the effectiveness of the approach. This approach is resilient enough to prevent identity, attribute, and membership disclosure attacks.
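As a rough illustration of the idea summarized in the abstract, the sketch below estimates pairwise mutual information between categorical attributes and groups strongly related attributes into small vertical fragments. It is not the authors' algorithm: the greedy merge rule, the function names `mutual_information` and `greedy_fragments`, and the `max_size` parameter are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two categorical columns."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


def greedy_fragments(columns, max_size):
    """Hypothetical greedy grouping (an assumption, not the paper's procedure):
    visit attribute pairs from highest to lowest mutual information and merge
    their fragments whenever the result has at most max_size attributes."""
    fragment_of = {name: {name} for name in columns}
    pairs = sorted(
        combinations(columns, 2),
        key=lambda p: mutual_information(columns[p[0]], columns[p[1]]),
        reverse=True,
    )
    for a, b in pairs:
        fa, fb = fragment_of[a], fragment_of[b]
        if fa is not fb and len(fa) + len(fb) <= max_size:
            merged = fa | fb
            for name in merged:
                fragment_of[name] = merged
    # Each attribute maps to its fragment; collapse duplicates into a sorted list.
    unique = {frozenset(f) for f in fragment_of.values()}
    return sorted(sorted(f) for f in unique)


# Toy table: "job" mirrors "age", and "city" mirrors "zip".
toy = {
    "age": ["young", "young", "old", "old"],
    "job": ["student", "student", "retired", "retired"],
    "zip": ["a", "b", "a", "b"],
    "city": ["A", "B", "A", "B"],
}
fragments = greedy_fragments(toy, max_size=2)
```

On this toy table the correlated attributes end up in the same fragment, so each fragment could then be anonymized separately at a lower dimensionality.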
Pages: 341-373
Page count: 33
Related papers
50 in total
  • [21] Supporting Streaming Data Anonymization with Expressions of User Privacy Preferences
    Sakpere, Aderonke Busayo
    Kayem, Anne V. D. M.
    INFORMATION SYSTEMS SECURITY AND PRIVACY, ICISSP 2015, 2015, 576 : 122 - 136
  • [22] Efficient multimedia big data anonymization
    Jang, Sung-Bong
    Ko, Young-Woong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (17) : 17855 - 17872
  • [23] Privacy preserving publication of relational and transaction data: Survey on the anonymization of patient data
    Puri, Vartika
    Sachdeva, Shelly
    Kaur, Parmeet
    COMPUTER SCIENCE REVIEW, 2019, 32 : 45 - 61
  • [24] Protecting Privacy in Knowledge Graphs With Personalized Anonymization
    Hoang, Anh-Tu
    Carminati, Barbara
    Ferrari, Elena
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2024, 21 (04) : 2181 - 2193
  • [25] Efficient k-anonymization for privacy preservation
    Liang, Z.
    Wei, R.
    PROCEEDINGS OF THE 2008 12TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, VOLS I AND II, 2008, : 737 - 742
  • [26] Parallel privacy preservation through partitioning (P4): a scalable data anonymization algorithm for health data
    Halilovic, Mehmed
    Meurers, Thierry
    Otte, Karen
    Prasser, Fabian
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, 25 (01)
  • [27] Performance Analysis of Various Anonymization Techniques for Privacy Preservation of Sensitive Data
    Sreevidya, B.
    Rajesh, M.
    Sasikala, T.
    INTERNATIONAL CONFERENCE ON INTELLIGENT DATA COMMUNICATION TECHNOLOGIES AND INTERNET OF THINGS, ICICI 2018, 2019, 26 : 687 - 693
  • [28] An anonymization protocol for continuous and dynamic privacy-preserving data collection
    Kim, Soohyung
    Chung, Yon Dohn
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 93 : 1065 - 1073
  • [29] Toward Scalable Anonymization for Privacy-Preserving Big Data Publishing
    Mehta, Brijesh B.
    Rao, Udai Pratap
    RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 2, 2018, 708 : 297 - 304
  • [30] Anonymization Technique through Record Elimination to Preserve Privacy of Published Data
    Mahesh, R.
    Meyyappan, T.
    2013 INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, INFORMATICS AND MEDICAL ENGINEERING (PRIME), 2013,