Data Pre-Processing for Discrimination Prevention: Information-Theoretic Optimization and Analysis

Cited by: 18
Authors
Calmon, Flavio du Pin [1]
Wei, Dennis [2]
Vinzamuri, Bhanukiran [2]
Ramamurthy, Karthikeyan Natesan [2]
Varshney, Kush R. [2]
Affiliations
[1] Harvard Univ, John A Paulson Sch Engn & Appl Sci, Cambridge, MA 02138 USA
[2] IBM Res AI, Yorktown Hts, NY 10598 USA
Keywords
Machine learning; ethics; optimization
DOI
10.1109/JSTSP.2018.2865887
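Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809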
Abstract
Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling group discrimination, limiting distortion in individual data samples, and preserving utility. Several theoretical properties are established, including conditions for convexity, a characterization of the impact of limited sample size on discrimination and utility guarantees, and a connection between discrimination and estimation. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy and with precise control of individual distortion.
Pages: 1106 - 1119
Page count: 14
Related papers
50 in total
  • [1] Optimized Pre-Processing for Discrimination Prevention
    Calmon, Flavio P.
    Wei, Dennis
    Vinzamuri, Bhanukiran
    Ramamurthy, Karthikeyan Natesan
    Varshney, Kush R.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] Acceleration of information-theoretic data analysis with graphics processing units
    Sluga, Davor
    Curk, Tomaz
    Zupan, Blaz
    Lotric, Uros
    PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (02) : 136 - 139
  • [3] Introduction to the Issue on Information-Theoretic Methods in Data Acquisition, Analysis, and Processing
    Rodrigues, M.
    Bolcskei, H.
    Draper, S.
    Eldar, Y.
    Tan, V.
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2018, 12 (05) : 821 - 824
  • [4] An Information-Theoretic Foundation for the Measurement of Discrimination Information
    Cai, Di
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (09) : 1262 - 1273
  • [5] An Information-Theoretic View of Array Processing
    Dmochowski, Jacek
    Benesty, Jacob
    Affes, Sofiene
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (02) : 392 - 401
  • [6] An Information-Theoretic Quantification of Discrimination with Exempt Features
    Dutta, Sanghamitra
    Venkatesh, Praveen
    Mardziel, Piotr
    Datta, Anupam
    Grover, Pulkit
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 3825 - 3833
  • [7] Information-theoretic Analysis of Bayesian Test Data Sensitivity
    Futami, Futoshi
    Iwata, Tomoharu
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [8] On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning
    Wang, Hao
    Ustun, Berk
    Calmon, Flavio P.
    2018 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2018 : 1216 - 1220
  • [9] An Information-Theoretic Approach to Portfolio Optimization
    Djakam, W. Ngambou
    Tanik, Murat M.
    SOUTHEASTCON 2022, 2022 : 332 - 338
  • [10] Information-Theoretic Exploration with Bayesian Optimization
    Bai, Shi
    Wang, Jinkun
    Chen, Fanfei
    Englot, Brendan
    2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), 2016 : 1816 - 1822