PreCoF: counterfactual explanations for fairness

Cited by: 0

Authors:

Sofie Goethals

David Martens

Toon Calders

Affiliations:

[1] University of Antwerp, Department of Engineering Management

[2] University of Antwerp, Department of Computer Science

Source:

Machine Learning | 2024 / Vol. 113

Keywords:

Explainable Artificial Intelligence; Counterfactual explanations; Fairness; Data science ethics

DOI:

Not available

Abstract:

This paper studies how counterfactual explanations can be used to assess the fairness of a model. Using machine learning for high-stakes decisions is a threat to fairness as these models can amplify bias present in the dataset, and there is no consensus on a universal metric to detect this. The appropriate metric and method to tackle the bias in a dataset will be case-dependent, and it requires insight into the nature of the bias first. We aim to provide this insight by integrating explainable AI (XAI) research with the fairness domain. More specifically, apart from being able to use (Predictive) Counterfactual Explanations to detect explicit bias when the model is directly using the sensitive attribute, we show that it can also be used to detect implicit bias when the model does not use the sensitive attribute directly but does use other correlated attributes leading to a substantial disadvantage for a protected group. We call this metric PreCoF, or Predictive Counterfactual Fairness. Our experimental results show that our metric succeeds in detecting occurrences of implicit bias in the model by assessing which attributes are more present in the explanations of the protected group compared to the unprotected group. These results could help policymakers decide on whether this discrimination is justified or not.
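
The comparison described at the end of the abstract (which attributes appear more often in the counterfactual explanations of the protected group than in those of the unprotected group) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the representation of an explanation as a set of changed attribute names, and the toy data are all assumptions made for illustration, and the step that generates the counterfactual explanations is taken as given.

```python
from collections import Counter


def attribute_frequencies(explanations):
    """Fraction of explanations in which each attribute appears.

    `explanations` is a list of sets; each set holds the attribute names
    changed in one counterfactual explanation (hypothetical format).
    """
    n = len(explanations)
    if n == 0:
        return {}
    counts = Counter(attr for expl in explanations for attr in set(expl))
    return {attr: count / n for attr, count in counts.items()}


def frequency_gap(protected_expls, unprotected_expls):
    """Rank attributes by how much more often they appear for the protected group."""
    p = attribute_frequencies(protected_expls)
    u = attribute_frequencies(unprotected_expls)
    gaps = {a: p.get(a, 0.0) - u.get(a, 0.0) for a in set(p) | set(u)}
    # Largest positive gap first: attributes over-represented in the
    # protected group's explanations are candidates for implicit bias.
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
protected = [
    {"marital_status"},
    {"marital_status", "hours"},
    {"occupation"},
    {"marital_status"},
]
unprotected = [
    {"education"},
    {"hours"},
    {"education", "hours"},
    {"marital_status"},
]

ranking = frequency_gap(protected, unprotected)
for attr, gap in ranking:
    print(f"{attr}: {gap:+.2f}")
```

In this toy data, `marital_status` appears in 75% of the protected group's explanations but only 25% of the unprotected group's, so it ranks first. Whether such a gap reflects unjustified discrimination is, as the abstract notes, a judgment left to policymakers.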

Pages: 3111-3142

Number of pages: 31

50 records in total

[1] PreCoF: counterfactual explanations for fairness
Goethals, Sofie
Martens, David
Calders, Toon
MACHINE LEARNING, 2024, 113 (05) : 3111 - 3142
[2] Counterfactual Fairness
Kusner, Matt
Loftus, Joshua
Russell, Chris
Silva, Ricardo
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
[3] Counterfactual Visual Explanations
Goyal, Yash
Wu, Ziyan
Ernst, Jan
Batra, Dhruv
Parikh, Devi
Lee, Stefan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
[4] Counterfactual Causality and Historical Explanations
Gerber, Doris
EXPLANATION IN ACTION THEORY AND HISTORIOGRAPHY: CAUSAL AND TELEOLOGICAL APPROACHES, 2019: 167 - 178
[5] Counterfactual Explanations for Models of Code
Cito, Juergen
Dillig, Isil
Murali, Vijayaraghavan
Chandra, Satish
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP 2022), 2022: 125 - 134
[6] On generating trustworthy counterfactual explanations
Del Ser, Javier
Barredo-Arrieta, Alejandro
Diaz-Rodriguez, Natalia
Herrera, Francisco
Saranti, Anna
Holzinger, Andreas
INFORMATION SCIENCES, 2024, 655
[7] Diffusion Models for Counterfactual Explanations
Jeanneret, Guillaume
Simon, Loic
Jurie, Frederic
COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
[8] Diffusion Models for Counterfactual Explanations
Jeanneret, Guillaume
Simon, Loic
Jurie, Frederic
COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 219 - 237
[9] Counterfactual Explanations for Neural Recommenders
Tran, Khanh Hiep
Ghazimatin, Azin
Roy, Rishiraj Saha
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 1627 - 1631
[10] Adversarial Counterfactual Visual Explanations
Jeanneret, Guillaume
Simon, Loic
Jurie, Frederic
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 16425 - 16435
