On Sampling, Anonymization, and Differential Privacy: Or, k-Anonymization Meets Differential Privacy

Citations: 0
Authors
Li, Ninghui [1]
Qardaji, Wahbeh [1]
Su, Dong [1]
Affiliations
[1] Purdue Univ, 305 N Univ St, W Lafayette, IN 47907 USA
Source
7TH ACM SYMPOSIUM ON INFORMATION, COMPUTER AND COMMUNICATIONS SECURITY (ASIACCS 2012) | 2012
Funding
National Science Foundation (USA)
Keywords
Differential Privacy; Anonymization; Data Privacy; ANONYMITY
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper aims to answer two questions in privacy-preserving data analysis and publishing: What formal privacy guarantee (if any) does k-anonymization provide? How can we benefit from the adversary's uncertainty about the data? We have found that random sampling provides a connection that helps answer both questions, as sampling can create uncertainty. The main result of the paper is that k-anonymization, when done "safely" and when preceded by a random sampling step, satisfies (epsilon, delta)-differential privacy with reasonable parameters. This result illustrates that "hiding in a crowd of k" indeed offers some privacy guarantees. We point out, however, that almost all existing k-anonymization algorithms in the literature are not "safe". Regarding the second question, we provide both positive and negative results. On the positive side, we show that adding a random-sampling pre-processing step to a differentially private algorithm can greatly amplify the level of privacy protection. Hence, when given a dataset resulting from sampling, one can use a much larger privacy budget. On the negative side, any privacy notion that takes advantage of the adversary's uncertainty likely does not compose.
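The amplification effect mentioned in the abstract can be illustrated with a small calculation. The abstract itself does not state a formula; the sketch below assumes the commonly cited amplification-by-sampling bound, under which an (epsilon, delta)-differentially private algorithm run on a sample that includes each record independently with probability beta satisfies (ln(1 + beta(e^epsilon - 1)), beta * delta)-differential privacy. The function names `amplified_epsilon` and `amplified_delta` are hypothetical and chosen only for this illustration.

```python
import math


def amplified_epsilon(epsilon: float, beta: float) -> float:
    """Effective epsilon after sampling each record with probability beta.

    Assumes the standard bound eps' = ln(1 + beta * (e^eps - 1)),
    which is not quoted in the abstract itself.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if epsilon < 0.0:
        raise ValueError("epsilon must be non-negative")
    # log1p/expm1 keep the result accurate for small epsilon and beta.
    return math.log1p(beta * math.expm1(epsilon))


def amplified_delta(delta: float, beta: float) -> float:
    """Effective delta after sampling (assumed bound: delta' = beta * delta)."""
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must be in [0, 1]")
    return beta * delta


if __name__ == "__main__":
    base_epsilon = 1.0
    for beta in (1.0, 0.5, 0.1, 0.01):
        eps = amplified_epsilon(base_epsilon, beta)
        print(f"beta={beta:<5} effective epsilon={eps:.4f}")
```

Read in the other direction, this is the sense in which a sampled dataset permits a larger privacy budget: for a fixed target guarantee on the full population, a smaller beta lets the algorithm applied to the sample run with a larger epsilon.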
Pages: 11