Implications of Data Anonymization on the Statistical Evidence of Disparity

Cited by: 7
Authors
Xu, Heng [1]
Zhang, Nan [1]
Affiliations
[1] American University, Kogod School of Business, Washington, DC 20016, USA
Funding
National Science Foundation (United States)
Keywords
privacy; data anonymization; discrimination; statistical disparity; differential privacy; health disparities; k-anonymity; bias; protection; accuracy; security; proof
DOI
10.1287/mnsc.2021.4028
Chinese Library Classification
C93 [Management science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
Research on and practical development of data-anonymization techniques have proliferated in recent years. Yet limited attention has been paid to examining the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. We then develop a conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies with how disparity is operationalized statistically. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance the protection of privacy against the recognition and rectification of disparate impact.
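The abstract's central claim, that anonymization can mask a disparity between subpopulations, can be illustrated with a small calculation. The sketch below is not the authors' formalism: the abstract does not name its two mechanisms or its two disparity measures, so this example assumes one noise-based mechanism (randomized response, where each binary outcome is flipped with a fixed probability) and one disparity measure (the difference in positive-outcome rates between two groups). The function names and the example rates are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the abstract):
#   - the anonymization mechanism is randomized response on a binary outcome
#   - disparity is measured as the difference in positive-outcome rates


def observed_rate(true_rate: float, flip_prob: float) -> float:
    """Expected positive rate after each binary outcome is flipped
    independently with probability flip_prob."""
    kept_positives = true_rate * (1 - flip_prob)
    flipped_negatives = (1 - true_rate) * flip_prob
    return kept_positives + flipped_negatives


def rate_gap(rate_a: float, rate_b: float) -> float:
    """Disparity measured as the difference between two group rates."""
    return rate_a - rate_b


# Hypothetical true positive-outcome rates for two subpopulations.
true_a, true_b = 0.60, 0.30
flip_prob = 0.25

true_gap = rate_gap(true_a, true_b)
observed_gap = rate_gap(
    observed_rate(true_a, flip_prob),
    observed_rate(true_b, flip_prob),
)

print(f"true gap:     {true_gap:.2f}")
print(f"observed gap: {observed_gap:.2f}")
```

Under these assumptions the expected gap shrinks by a factor of (1 - 2 * flip_prob), so stronger noise leaves less visible evidence of the underlying disparity. Other mechanisms and other disparity measures may behave differently, which is the distinction the abstract draws.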
Pages: 2600-2618
Page count: 20