Implications of Data Anonymization on the Statistical Evidence of Disparity

Cited by: 7

Authors:

Xu, Heng [1]

Zhang, Nan [1]

Affiliations:

[1] Amer Univ, Kogod Sch Business, Washington, DC 20016 USA

Source:

MANAGEMENT SCIENCE | 2022 / Vol. 68 / Issue 04

Funding:

National Science Foundation (United States);

Keywords:

privacy; data anonymization; discrimination; statistical disparity; DIFFERENTIAL PRIVACY; HEALTH DISPARITIES; K-ANONYMITY; BIAS; DISCRIMINATION; PROTECTION; ACCURACY; SECURITY; PROOF;

DOI:

10.1287/mnsc.2021.4028

Chinese Library Classification (CLC) number:

C93 [Management];

Discipline classification codes:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

Research and practical development of data-anonymization techniques have proliferated in recent years. Yet, limited attention has been paid to examine the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies based on its statistical operationalization. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance between the protection of privacy and the recognition/rectification of disparate impact.

Pages: 2600 - 2618

Page count: 20
