Implications of Data Anonymization on the Statistical Evidence of Disparity

Cited by: 7
Authors
Xu, Heng [1]
Zhang, Nan [1]
Affiliations
[1] American University, Kogod School of Business, Washington, DC 20016, USA
Funding
U.S. National Science Foundation
Keywords
privacy; data anonymization; discrimination; statistical disparity; DIFFERENTIAL PRIVACY; HEALTH DISPARITIES; K-ANONYMITY; BIAS; DISCRIMINATION; PROTECTION; ACCURACY; SECURITY; PROOF
DOI
10.1287/mnsc.2021.4028
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
Research and practical development of data-anonymization techniques have proliferated in recent years. Yet limited attention has been paid to examining the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop a conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies based on its statistical operationalization. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance between the protection of privacy and the recognition/rectification of disparate impact.
Pages: 2600-2618
Number of pages: 20