Learning to Explain: A Model-Agnostic Framework for Explaining Black Box Models

Cited by: 2
Authors
Barkan, Oren [1]
Asher, Yuval [2]
Eshel, Amit [2]
Elisha, Yehonatan [1]
Koenigstein, Noam [2]
Affiliations
[1] Open Univ, Milton Keynes, England
[2] Tel Aviv Univ, Tel Aviv, Israel
Source
23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023 | 2023
Funding
Israel Science Foundation;
Keywords
Explainable AI; computer vision; transformers;
DOI
10.1109/ICDM58522.2023.00105
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present Learning to Explain (LTX), a model-agnostic framework for providing post-hoc explanations of vision models. The LTX framework introduces an "explainer" model that generates explanation maps highlighting the crucial regions that justify the predictions made by the model being explained. To train the explainer, we employ a two-stage process consisting of initial pretraining followed by per-instance finetuning. During both stages of training, we utilize a unique configuration in which we compare the explained model's prediction for a masked input with its original prediction for the unmasked input. This approach enables the use of a novel counterfactual objective, which aims to anticipate the model's output using masked versions of the input image. Importantly, the LTX framework is not restricted to a specific model architecture and can provide explanations for both Transformer-based and convolutional models. Through our evaluations, we demonstrate that LTX significantly outperforms the current state-of-the-art in explainability across various metrics. Our code is available at: https://github.com/LTX-Code/LTX
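The comparison between the explained model's prediction on a masked input and its prediction on the unmasked input can be illustrated with a small sketch. The abstract does not state the actual loss, so everything below is an assumption made for illustration: `toy_model` is a hypothetical linear stand-in for the explained model, and `masked_prediction_loss`, its three terms, and `sparsity_weight` are invented names, not the LTX objective or API.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(x, weights):
    """Hypothetical stand-in for the explained model: linear logits + softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def masked_prediction_loss(x, mask, weights, sparsity_weight=0.1):
    """Illustrative loss (an assumption, not the paper's objective).

    Compares the model's output on masked inputs with its original
    prediction on the unmasked input:
      - preserve: the kept region alone should still support the original class
      - counterfactual: the removed region alone should not support it
      - sparsity: smaller masks are preferred
    """
    probs = toy_model(x, weights)
    target = max(range(len(probs)), key=probs.__getitem__)  # original prediction

    kept = [m * xi for m, xi in zip(mask, x)]
    removed = [(1.0 - m) * xi for m, xi in zip(mask, x)]

    p_kept = toy_model(kept, weights)[target]
    p_removed = toy_model(removed, weights)[target]

    eps = 1e-12  # avoids log(0)
    preserve = -math.log(p_kept + eps)
    counterfactual = -math.log(1.0 - p_removed + eps)
    sparsity = sum(mask) / len(mask)
    return preserve + counterfactual + sparsity_weight * sparsity


# Toy setup: class 0 depends only on feature 0, class 1 on the other features.
WEIGHTS = [[4.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 1.0]]
X = [1.0, 1.0, 1.0, 1.0]

good_mask = [1.0, 0.0, 0.0, 0.0]  # keeps the feature that drives class 0
bad_mask = [0.0, 1.0, 1.0, 1.0]   # keeps everything except that feature

good_loss = masked_prediction_loss(X, good_mask, WEIGHTS)
bad_loss = masked_prediction_loss(X, bad_mask, WEIGHTS)
```

In this toy setup the mask that keeps the decisive feature receives the lower loss. In a setting like the one the abstract describes, an explainer network would produce the mask and be trained against such a signal; that training loop is omitted here.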
Pages: 944-949
Page count: 6