Implications of Data Anonymization on the Statistical Evidence of Disparity

Cited by: 7

Authors:

Xu, Heng [1]

Zhang, Nan [1]

Affiliations:

[1] Amer Univ, Kogod Sch Business, Washington, DC 20016 USA

Source:

MANAGEMENT SCIENCE | 2022 / Vol. 68 / Issue 04

Funding:

U.S. National Science Foundation;

Keywords:

privacy; data anonymization; discrimination; statistical disparity; DIFFERENTIAL PRIVACY; HEALTH DISPARITIES; K-ANONYMITY; BIAS; DISCRIMINATION; PROTECTION; ACCURACY; SECURITY; PROOF;

DOI:

10.1287/mnsc.2021.4028

Chinese Library Classification (CLC) number:

C93 [Management Science];

Subject classification codes:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

Research on and practical development of data-anonymization techniques have proliferated in recent years. Yet limited attention has been paid to examining the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop a conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies based on its statistical operationalization. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance the protection of privacy against the recognition/rectification of disparate impact.


Pages: 2600 - 2618

Page count: 20

50 items in total

[41] Pattern-Guided Data Anonymization and Clustering
Bredereck, Robert
Nichterlein, Andre
Niedermeier, Rolf
Philip, Geevarghese
MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2011, 2011, 6907 : 182 - 193
[42] k-NDDP: An Efficient Anonymization Model for Social Network Data Release
Shakeel, Shafaq
Anjum, Adeel
Asheralieva, Alia
Alam, Masoom
ELECTRONICS, 2021, 10 (19)
[43] Steered Microaggregation: A Unified Primitive for Anonymization of Data Sets and Data Streams
Domingo-Ferrer, Josep
Soria-Comas, Jordi
2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017), 2017 : 995 - 1002
[44] Parallel privacy preservation through partitioning (P4): a scalable data anonymization algorithm for health data
Halilovic, Mehmed
Meurers, Thierry
Otte, Karen
Prasser, Fabian
BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, 25 (01)
[45] Android Sensor Data Anonymization
Claiborne, Cynthia
Fazeen, Mohamed
Dantu, Ram
RESEARCH IN ATTACKS, INTRUSIONS, AND DEFENSES, 2013, 8145 : 469 - 471
[46] EXTENDING SUPPRESSION FOR ANONYMIZATION ON SET-VALUED DATA
Wang, Shyue-Liang
Tsai, Yu-Chuan
Kao, Hung-Yu
Hong, Tzung-Pei
INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2011, 7 (12) : 6849 - 6863
[47] Analysis of Data Anonymization Techniques
Marques, Joana Ferreira
Bernardino, Jorge
PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KEOD), VOL 2, 2020 : 235 - 241
[48] The effect of homogeneity on the computational complexity of combinatorial data anonymization
Bredereck, Robert
Nichterlein, Andre
Niedermeier, Rolf
Philip, Geevarghese
DATA MINING AND KNOWLEDGE DISCOVERY, 2014, 28 (01) : 65 - 91
[49] Utility-preserving anonymization for health data publishing
Lee, Hyukki
Kim, Soohyung
Kim, Jong Wook
Chung, Yon Dohn
BMC MEDICAL INFORMATICS AND DECISION MAKING, 2017, 17
[50] Mobile Sensor Data Anonymization
Malekzadeh, Mohammad
Clegg, Richard G.
Cavallaro, Andrea
Haddadi, Hamed
PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTERNET OF THINGS DESIGN AND IMPLEMENTATION (IOTDI '19), 2019 : 49 - 58
