PreCoF: counterfactual explanations for fairness

Authors
Sofie Goethals
David Martens
Toon Calders
Affiliations
[1] University of Antwerp,Department of Engineering Management
[2] University of Antwerp,Department of Computer Science
Source
Machine Learning | 2024 / Vol. 113
Keywords
Explainable Artificial Intelligence; Counterfactual explanations; Fairness; Data science ethics
DOI
Not available
Abstract
This paper studies how counterfactual explanations can be used to assess the fairness of a model. Using machine learning for high-stakes decisions is a threat to fairness because these models can amplify bias present in the dataset, and there is no consensus on a universal metric to detect this. The appropriate metric and method to tackle the bias in a dataset will be case-dependent, and choosing them first requires insight into the nature of the bias. We aim to provide this insight by integrating explainable AI (XAI) research with the fairness domain. More specifically, apart from being able to use (Predictive) Counterfactual Explanations to detect explicit bias when the model is directly using the sensitive attribute, we show that they can also be used to detect implicit bias when the model does not use the sensitive attribute directly but does use other correlated attributes, leading to a substantial disadvantage for a protected group. We call this metric PreCoF, or Predictive Counterfactual Fairness. Our experimental results show that our metric succeeds in detecting occurrences of implicit bias in the model by assessing which attributes are more present in the explanations of the protected group compared to the unprotected group. These results could help policymakers decide whether this discrimination is justified.
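The comparison described in the abstract, checking which attributes appear more often in the explanations of the protected group than in those of the unprotected group, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the counterfactual explanations have already been generated by some explainer and that each one is represented as the set of attributes it changes. The function names `attribute_frequency` and `frequency_gap`, the attribute names, and the toy data are all hypothetical.

```python
from collections import Counter


def attribute_frequency(explanations):
    """Return, per attribute, the fraction of explanations that contain it.

    `explanations` is a list of sets; each set holds the attributes changed
    by one counterfactual explanation (an assumed representation).
    """
    n = len(explanations)
    if n == 0:
        return {}
    counts = Counter(attr for expl in explanations for attr in set(expl))
    return {attr: count / n for attr, count in counts.items()}


def frequency_gap(protected_expls, unprotected_expls):
    """Rank attributes by how much more often they occur for the protected group.

    A large positive gap marks an attribute that shows up disproportionately
    in the explanations of the protected group.
    """
    freq_protected = attribute_frequency(protected_expls)
    freq_unprotected = attribute_frequency(unprotected_expls)
    attrs = set(freq_protected) | set(freq_unprotected)
    gaps = {
        a: freq_protected.get(a, 0.0) - freq_unprotected.get(a, 0.0)
        for a in attrs
    }
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: one set of changed attributes per explained instance.
protected = [
    {"marital_status"},
    {"marital_status", "hours"},
    {"occupation"},
    {"marital_status"},
]
unprotected = [
    {"education"},
    {"hours"},
    {"education", "occupation"},
    {"marital_status"},
]

ranking = frequency_gap(protected, unprotected)
for attr, gap in ranking:
    print(f"{attr}: {gap:+.2f}")
```

In this toy data the hypothetical attribute `marital_status` ranks first, which is the kind of signal the abstract associates with possible implicit bias; whether such a gap is justified remains a judgment for the policymaker.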
Pages: 3111-3142
Page count: 31