Learning to Explain: A Model-Agnostic Framework for Explaining Black Box Models

Cited by: 2

Authors:

Barkan, Oren [1]

Asher, Yuval [2]

Eshel, Amit [2]

Elisha, Yehonatan [1]

Koenigstein, Noam [2]

Affiliations:

[1] Open Univ, Milton Keynes, England

[2] Tel Aviv Univ, Tel Aviv, Israel

Source:

23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023 | 2023

Funding:

Israel Science Foundation;

Keywords:

Explainable AI; computer vision; transformers;

DOI:

10.1109/ICDM58522.2023.00105

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present Learning to Explain (LTX), a model-agnostic framework designed for providing post-hoc explanations for vision models. The LTX framework introduces an "explainer" model that generates explanation maps, highlighting the crucial regions that justify the predictions made by the model being explained. To train the explainer, we employ a two-stage process consisting of initial pretraining followed by per-instance finetuning. During both stages of training, we utilize a unique configuration where we compare the explained model's prediction for a masked input with its original prediction for the unmasked input. This approach enables the use of a novel counterfactual objective, which aims to anticipate the model's output using masked versions of the input image. Importantly, the LTX framework is not restricted to a specific model architecture and can provide explanations for both Transformer-based and convolutional models. Through our evaluations, we demonstrate that LTX significantly outperforms the current state-of-the-art in explainability across various metrics. Our code is available at: https://github.com/LTX-Code/LTX
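
The comparison between the explained model's prediction on a masked input and its prediction on the unmasked input can be illustrated with a small sketch. This is not the authors' implementation or loss: the toy linear classifier `toy_model`, the helper `apply_mask`, the function `masked_prediction_loss`, its three terms, and the `sparsity_weight` parameter are all hypothetical choices made for illustration. The sketch only shows one plausible way a mask could be scored against the original prediction.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(pixels):
    """Hypothetical stand-in for the explained model (4 pixels, 2 classes)."""
    weights = [
        [2.0, 2.0, 0.0, 0.0],  # class 0 depends on pixels 0 and 1
        [0.0, 0.0, 2.0, 2.0],  # class 1 depends on pixels 2 and 3
    ]
    logits = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    return softmax(logits)


def apply_mask(pixels, mask, baseline=0.0):
    """Keep pixels where mask is 1, replace them with a baseline where it is 0."""
    return [m * p + (1.0 - m) * baseline for p, m in zip(pixels, mask)]


def masked_prediction_loss(model, pixels, mask, sparsity_weight=0.1):
    """Assumed illustrative objective, lower is better.

    The class predicted on the unmasked input is the reference. The loss is
    low when the kept region preserves that prediction, the removed region
    alone does not reproduce it, and the mask is small.
    """
    eps = 1e-12
    original = model(pixels)
    target = max(range(len(original)), key=original.__getitem__)

    kept = model(apply_mask(pixels, mask))[target]
    inverse_mask = [1.0 - m for m in mask]
    removed = model(apply_mask(pixels, inverse_mask))[target]

    preserve_term = -math.log(kept + eps)
    destroy_term = -math.log(1.0 - removed + eps)
    sparsity_term = sparsity_weight * sum(mask) / len(mask)
    return preserve_term + destroy_term + sparsity_term


pixels = [1.0, 1.0, 0.2, 0.2]      # the toy model predicts class 0 here
good_mask = [1.0, 1.0, 0.0, 0.0]   # keeps the pixels class 0 relies on
bad_mask = [0.0, 0.0, 1.0, 1.0]    # keeps only the irrelevant pixels

good_loss = masked_prediction_loss(toy_model, pixels, good_mask)
bad_loss = masked_prediction_loss(toy_model, pixels, bad_mask)
```

Under these assumptions, `good_loss` comes out lower than `bad_loss`, because the first mask retains the evidence for the original prediction while the second discards it. In a learned setting, an explainer network would output the mask and be trained to minimize a loss of this kind; the two-stage pretraining and per-instance finetuning described in the abstract are not modeled here.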

Pages: 944-949

Page count: 6
