Managing dimensionality in data privacy anonymization

Cited by: 16

Authors:

Zakerzadeh, Hessam [1]

Aggarwal, Charu C. [2]

Barker, Ken [1]

Affiliations:

[1] Univ Calgary, Calgary, AB, Canada

[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA

Source:

KNOWLEDGE AND INFORMATION SYSTEMS | 2016 / Volume 49 / Issue 01

Keywords:

High-dimensional anonymization; Privacy; k-Anonymity; l-Diversity; Vertical fragmentation

DOI:

10.1007/s10115-015-0906-8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

The curse of dimensionality has remained a challenge for a wide variety of algorithms in data mining, clustering, classification, and privacy. Recently, it was shown that increasing dimensionality makes the data resistant to effective privacy. The theoretical results seem to suggest that the dimensionality curse is a fundamental barrier to privacy preservation. However, in practice, we show that some of the common properties of real data can be leveraged in order to greatly ameliorate the negative effects of the curse of dimensionality. In real data sets, many dimensions contain high levels of inter-attribute correlations. Such correlations enable the use of a process known as vertical fragmentation in order to decompose the data into vertical subsets of smaller dimensionality. An information-theoretic criterion of mutual information is used in the vertical decomposition process. This allows the use of an anonymization process, which is based on combining results from multiple independent fragments. We present a general approach, which can be applied to the k-anonymity, l-diversity, and t-closeness models. In the presence of inter-attribute correlations, such an approach continues to be much more robust in higher dimensionality, without losing accuracy. We present experimental results illustrating the effectiveness of the approach. This approach is resilient enough to prevent identity, attribute, and membership disclosure attacks.
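
As a rough illustration of the idea in the abstract, the sketch below computes empirical mutual information between categorical attributes and uses it to group attributes into vertical fragments. The abstract does not specify the grouping procedure, so the greedy strategy, the function names (`mutual_information`, `greedy_fragments`), the `max_size` parameter, and the toy table are all assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two categorical columns."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of the first attribute
    py = Counter(ys)            # marginal counts of the second attribute
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def greedy_fragments(columns, max_size):
    """Hypothetical greedy grouping (an assumption, not the paper's method).

    Each fragment is seeded with the first unassigned attribute, then grows
    by adding the unassigned attribute with the highest total mutual
    information to the attributes already in the fragment.
    """
    remaining = list(columns)
    fragments = []
    while remaining:
        frag = [remaining.pop(0)]
        while remaining and len(frag) < max_size:
            best = max(
                remaining,
                key=lambda c: sum(
                    mutual_information(columns[c], columns[m]) for m in frag
                ),
            )
            remaining.remove(best)
            frag.append(best)
        fragments.append(frag)
    return fragments


# Toy table (made-up values): age/income and city/zip are correlated pairs.
toy = {
    "age": ["y", "y", "o", "o"],
    "city": ["a", "b", "a", "b"],
    "income": ["l", "l", "h", "h"],
    "zip": ["1", "2", "1", "2"],
}
fragments = greedy_fragments(toy, max_size=2)
print(fragments)
```

On this toy table the correlated attributes end up in the same fragment, so each fragment could then be anonymized separately at lower dimensionality. How the anonymized fragments are recombined is not covered by this sketch.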

Citation

Pages: 341-373

Page count: 33

