In-Training Explainability Frameworks: A Method to Make Black-Box Machine Learning Models More Explainable

Cited by: 0

Authors:

Acun, Cagla [1]

Nasraoui, Olfa [1]

Affiliations:

[1] Univ Louisville, Web Min & Knowledge Discovery Lab, Louisville, KY 40292 USA

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WI-IAT | 2023

Keywords:

Explainability in Artificial Intelligence; XAI;

DOI:

10.1109/WI-IAT59888.2023.00036

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Despite ongoing efforts to make black-box machine learning models more explainable, transparent, and trustworthy, there is growing advocacy for using only inherently interpretable models for high-stakes decision making. For instance, post-hoc explanations have recently been criticized because they learn surrogate white-box (explainer) models that, while optimized to approximate the original predictive model, remain different from it. Moreover, post-hoc models necessitate a post-hoc training phase at prediction time, which adds to the computational burden. In this paper, we propose two novel explainability approaches that make black-box models more explainable, which we call pre-hoc explainability and co-hoc explainability. Our goal is to maintain the black-box model's prediction accuracy while benefiting from the explanations that come with an inherently interpretable white-box model, without the need for a post-hoc training phase at prediction time. In contrast to post-hoc methods, the black-box model training phase is guided by explanations that are used as a regularizer. Our experiments demonstrate the advantages of our proposed techniques on three real-life datasets in terms of fidelity, without compromising accuracy.
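
The idea of guiding black-box training with explanations used as a regularizer can be illustrated with a small sketch. The abstract does not give the loss function, so everything below is an assumption made for illustration: the white-box explainer is taken to be a logistic regression trained first and then held fixed, the black-box is a one-hidden-layer network, and the regularizer is the mean squared difference between the two models' predicted probabilities, weighted by a hypothetical coefficient `lam`. Fidelity is measured here as the fraction of points on which the two models predict the same label. This is not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_white_box(X, y, lr=0.5, steps=500):
    """Logistic regression by full-batch gradient descent (assumed explainer)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        g = (p - y) / len(y)          # gradient of mean BCE w.r.t. the logit
        w -= lr * (X.T @ g)
        b -= lr * g.sum()
    return w, b

def fit_black_box(X, y, p_wb, lam=1.0, hidden=16, lr=0.5, steps=2000, seed=0):
    """One-hidden-layer MLP trained on BCE + lam * mean((p_bb - p_wb)^2).

    The second term is the assumed explanation regularizer: it pulls the
    black-box predictions toward those of the fixed white-box model.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, (d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)
        p = sigmoid(h @ w2 + b2)
        # Gradient w.r.t. the output logit: BCE term plus fidelity term.
        dz = ((p - y) + 2.0 * lam * (p - p_wb) * p * (1.0 - p)) / n
        dh = np.outer(dz, w2) * (1.0 - h ** 2)   # backprop through tanh
        w2 -= lr * (h.T @ dz)
        b2 -= lr * dz.sum()
        W1 -= lr * (X.T @ dh)
        b1 -= lr * dh.sum(axis=0)

    def predict(X_new):
        return sigmoid(np.tanh(X_new @ W1 + b1) @ w2 + b2)

    return predict

# Synthetic data for illustration only (not one of the paper's datasets).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

w, b = fit_white_box(X, y)
p_wb = sigmoid(X @ w + b)
predict_bb = fit_black_box(X, y, p_wb, lam=1.0)
p_bb = predict_bb(X)

accuracy = np.mean((p_bb > 0.5) == (y > 0.5))       # black-box vs. labels
fidelity = np.mean((p_bb > 0.5) == (p_wb > 0.5))    # black-box vs. explainer
print(f"accuracy={accuracy:.3f} fidelity={fidelity:.3f}")
```

Setting `lam=0.0` recovers ordinary training with no explanation guidance, which gives a baseline for comparing fidelity. A co-hoc variant would presumably update both models in the same loop rather than fixing the white-box model first, but the abstract does not specify this.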


Pages: 230-237

Page count: 8

