Implications of Data Anonymization on the Statistical Evidence of Disparity

Cited by: 7

Authors:

Xu, Heng [1]

Zhang, Nan [1]

Affiliations:

[1] Amer Univ, Kogod Sch Business, Washington, DC 20016 USA

Source:

MANAGEMENT SCIENCE | 2022 / Vol. 68 / Issue 04

Funding:

U.S. National Science Foundation;

Keywords:

privacy; data anonymization; discrimination; statistical disparity; DIFFERENTIAL PRIVACY; HEALTH DISPARITIES; K-ANONYMITY; BIAS; DISCRIMINATION; PROTECTION; ACCURACY; SECURITY; PROOF;

DOI:

10.1287/mnsc.2021.4028

Chinese Library Classification (CLC) number:

C93 [Management Science];

Discipline classification codes:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

Research and practical development of data-anonymization techniques have proliferated in recent years. Yet, limited attention has been paid to examining the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop a conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies based on its statistical operationalization. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance between the protection of privacy and the recognition/rectification of disparate impact.

Pages: 2600-2618

Page count: 20

50 items in total

[1] A Review of Anonymization for Healthcare Data
Olatunji, Iyiola E.
Rauch, Jens
Katzensteiner, Matthias
Khosla, Megha
BIG DATA, 2022, : 538 - 555
[2] Efficient multimedia big data anonymization
Jang, Sung-Bong
Ko, Young-Woong
MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (17) : 17855 - 17872
[3] Hybrid Data Privacy and Anonymization Algorithms for Smart Health Applications
Fakeeroodeen Y.N.
Beeharry Y.
SN Computer Science, 2021, 2 (2)
[4] Anonymization of distribution feeder data using statistical distribution and parameter estimation approach
Ali, Muhammad
Prakash, Krishneel
Macana, Carlos
Rabiul, Md
Hussain, Akhtar
Pota, Hemanshu
SUSTAINABLE ENERGY TECHNOLOGIES AND ASSESSMENTS, 2022, 52
[5] An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques
Eyupoglu, Can
Aydin, Muhammed Ali
Zaim, Abdul Halim
Sertbas, Ahmet
ENTROPY, 2018, 20 (05)
[6] A utility based approach for data stream anonymization
Sopaoglu, Ugur
Abul, Osman
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2020, 54 (03) : 605 - 631
[7] Anonymization of nominal data based on semantic marginality
Domingo-Ferrer, Josep
Sanchez, David
Rufian-Torrell, Guillem
INFORMATION SCIENCES, 2013, 242 : 35 - 48
[8] Privacy and utility preserving data clustering for data anonymization and distribution on Hadoop
Nayahi, J. Jesu Vedha
Kavitha, V.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 74 : 393 - 408
[9] Data privacy in the Internet of Things based on anonymization: A review
Neves, Flavio
Souza, Rafael
Sousa, Juliana
Bonfim, Michel
Garcia, Vinicius
JOURNAL OF COMPUTER SECURITY, 2023, 31 (03) : 261 - 291
[10] Anonymization in the time of big data
Domingo-Ferrer J.
Soria-Comas J.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 9867 LNCS : 57 - 68