On Sampling, Anonymization, and Differential Privacy Or, K-Anonymization Meets Differential Privacy

Cited by: 0

Authors:

Li, Ninghui [1]

Qardaji, Wahbeh [1]

Su, Dong [1]

Affiliations:

[1] Purdue Univ, 305 N Univ St, W Lafayette, IN 47907 USA

Source:

7TH ACM SYMPOSIUM ON INFORMATION, COMPUTER AND COMMUNICATIONS SECURITY (ASIACCS 2012) | 2012

Funding:

U.S. National Science Foundation;

Keywords:

Differential Privacy; Anonymization; Data Privacy; ANONYMITY;

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

This paper aims to answer the following two questions in privacy-preserving data analysis and publishing: What formal privacy guarantee (if any) does k-anonymization provide? How can we benefit from the adversary's uncertainty about the data? We have found that random sampling provides a connection that helps answer these two questions, as sampling can create uncertainty. The main result of the paper is that k-anonymization, when done "safely" and when preceded by a random sampling step, satisfies (epsilon, delta)-differential privacy with reasonable parameters. This result illustrates that "hiding in a crowd of k" indeed offers some privacy guarantees. We point out, however, that almost all existing k-anonymization algorithms in the literature are not "safe". Regarding the second question, we provide both positive and negative results. On the positive side, we show that adding a random-sampling pre-processing step to a differentially private algorithm can greatly amplify the level of privacy protection. Hence, when given a dataset that resulted from sampling, one can use a much larger privacy budget. On the negative side, any privacy notion that takes advantage of the adversary's uncertainty likely does not compose.
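The amplification effect mentioned in the abstract can be illustrated numerically. The sketch below uses the standard amplification-by-subsampling bound for pure differential privacy, which is assumed here and is not quoted from the abstract: if each record is kept independently with probability beta and an epsilon-differentially private algorithm is run on the sample, the combined procedure satisfies epsilon'-differential privacy with epsilon' = ln(1 + beta * (e^epsilon - 1)). The function names `amplified_epsilon` and `budget_for_target` are hypothetical and chosen only for this illustration.

```python
import math


def amplified_epsilon(epsilon: float, beta: float) -> float:
    """Effective epsilon after sampling each record with probability beta.

    Assumes the standard subsampling bound for pure epsilon-DP:
        epsilon' = ln(1 + beta * (exp(epsilon) - 1))
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if epsilon < 0.0:
        raise ValueError("epsilon must be non-negative")
    # log1p/expm1 keep the computation accurate for small epsilon.
    return math.log1p(beta * math.expm1(epsilon))


def budget_for_target(target_epsilon: float, beta: float) -> float:
    """Inverse of amplified_epsilon: the budget the inner algorithm may
    spend on the sample so that the overall guarantee is target_epsilon."""
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if target_epsilon < 0.0:
        raise ValueError("target_epsilon must be non-negative")
    return math.log1p(math.expm1(target_epsilon) / beta)


if __name__ == "__main__":
    # With a 10% sample, an algorithm run at epsilon = 1.0 yields a
    # noticeably smaller effective epsilon.
    print(amplified_epsilon(1.0, 0.1))
    # Conversely, a target of 0.1 overall permits a larger inner budget.
    print(budget_for_target(0.1, 0.1))
```

Under this assumed bound, a smaller sampling probability always gives a smaller effective epsilon, which is the sense in which a sampled dataset permits a larger inner privacy budget.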

Pages: 11

50 items in total

[41] Situating Anonymization Within a Privacy Risk Model [J].

Shapiro, Stuart S. .

2012 IEEE INTERNATIONAL SYSTEMS CONFERENCE (SYSCON), 2012, :651-656

[42] On the Role of Data Anonymization in Machine Learning Privacy [J].

Senavirathne, Navoda ;

Torra, Vicenc .

2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, :664-675

[43] A Scalable (α, k)-Anonymization Approach using MapReduce for Privacy Preserving Big Data Publishing [J].

Mehta, Brijesh B. ;

Gupta, Ruchika ;

Rao, Udai Pratap ;

Muthiyan, Mukesh .

2019 10TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2019,

[44] Total Variation Meets Differential Privacy [J].

Ghazi, Elena ;

Issa, Ibrahim .

IEEE JOURNAL ON SELECTED AREAS IN INFORMATION THEORY, 2024, 5 :207-220

[45] Anonymization Techniques for Privacy Preserving Data Publishing: A Comprehensive Survey [J].

Majeed, Abdul ;

Lee, Sungchang .

IEEE ACCESS, 2021, 9 :8512-8545

[46] Achieving Perfect Location Privacy in Wireless Devices Using Anonymization [J].

Montazeri, Zarrin ;

Houmansadr, Amir ;

Pishro-Nik, Hossein .

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2017, 12 (11) :2683-2698

[47] Optimization algorithm for k-anonymization of datasets with low information loss [J].

Keisuke Murakami ;

Takeaki Uno .

International Journal of Information Security, 2018, 17 :631-644

[48] Privacy Preserving Attribute-Focused Anonymization Scheme for Healthcare Data Publishing [J].

Onesimu, J. Andrew ;

Karthikeyan, J. ;

Eunice, Jennifer ;

Pomplun, Marc ;

Hien Dang .

IEEE ACCESS, 2022, 10 :86979-86997

[49] MAGE: A semantics retaining K-anonymization method for mixed data [J].

Han, Jianmin ;

Yu, Juan ;

Mo, Yuchang ;

Lu, Jianfeng ;

Liu, Huawen .

KNOWLEDGE-BASED SYSTEMS, 2014, 55 :75-86

[50] A Top-Down k-Anonymization Implementation for Apache Spark [J].

Sopaoglu, Ugur ;

Abul, Osman .

2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, :4513-4521