In-Training Explainability Frameworks: A Method to Make Black-Box Machine Learning Models More Explainable

Authors
Acun, Cagla [1]
Nasraoui, Olfa [1]
Affiliations
[1] Univ Louisville, Web Min & Knowledge Discovery Lab, Louisville, KY 40292 USA
Source
2023 IEEE INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WI-IAT | 2023
Keywords
Explainability in Artificial Intelligence; XAI
DOI
10.1109/WI-IAT59888.2023.00036
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Despite ongoing efforts to make black-box machine learning models more explainable, transparent, and trustworthy, there is growing advocacy for using only inherently interpretable models for high-stakes decision making. For instance, post-hoc explanations have recently been criticized because they learn surrogate white-box (explainer) models that, while optimized to approximate the original predictive model, remain different from it. Moreover, post-hoc models require a post-hoc training phase at prediction time, which adds to the computational burden. In this paper, we propose two novel explainability approaches that make black-box models more explainable, which we call pre-hoc explainability and co-hoc explainability. Our goal is to maintain the black-box model's prediction accuracy while benefiting from the explanations that come with an inherently interpretable white-box model, without the need for a post-hoc training phase at prediction time. In contrast to post-hoc methods, the black-box model training phase is guided by explanations that are used as a regularizer. Our experiments demonstrate the advantages of our proposed technique on three real-life datasets, in terms of fidelity, without compromising accuracy.
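The idea of guiding black-box training with explanations used as a regularizer can be illustrated with a small sketch. The abstract does not give the paper's actual objective, so everything below is an assumption made for illustration: the function names, the squared-difference fidelity term, and the weight `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def explanation_regularized_loss(y_true, p_black_box, p_white_box, lam=0.5):
    """Prediction loss plus a fidelity penalty (illustrative form only).

    ASSUMPTION: fidelity is measured here as the mean squared difference
    between black-box and white-box predicted probabilities; the paper may
    define its regularizer differently.
    """
    accuracy_term = binary_cross_entropy(y_true, p_black_box)
    fidelity_term = float(np.mean((p_black_box - p_white_box) ** 2))
    return accuracy_term + lam * fidelity_term

# Toy data: one black-box output compared against two hypothetical explainers.
y = np.array([1.0, 0.0, 1.0, 0.0])
p_bb = np.array([0.9, 0.2, 0.8, 0.1])
p_wb_close = np.array([0.85, 0.25, 0.75, 0.15])  # explainer that agrees closely
p_wb_far = np.array([0.4, 0.6, 0.3, 0.7])        # explainer that disagrees

loss_close = explanation_regularized_loss(y, p_bb, p_wb_close)
loss_far = explanation_regularized_loss(y, p_bb, p_wb_far)
```

In this sketch, a black-box model whose outputs drift away from the white-box explainer is penalized during training, so agreement (fidelity) is encouraged at training time rather than approximated afterwards. Setting `lam` to 0 recovers the plain prediction loss.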
Pages: 230-237
Number of pages: 8