Contrasting Explanations for Understanding and Regularizing Model Adaptations

Citations: 3
Authors
Artelt, Andre [1 ]
Hinder, Fabian [1 ]
Vaquet, Valerie [1 ]
Feldhans, Robert [1 ]
Hammer, Barbara [1 ]
Affiliations
[1] Bielefeld Univ, Fac Technol, Univ Str 25, D-33615 Bielefeld, Nrw, Germany
Keywords
XAI; Contrasting explanations; Model adaptation; Human-centered AI;
DOI
10.1007/s11063-022-10826-5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many of today's decision-making systems deployed in the real world are not static: they change and adapt over time, a phenomenon known as model adaptation. Because of their wide-reaching influence and potentially serious consequences, the need for transparency and interpretability of AI-based decision-making systems is widely accepted and has therefore been studied extensively; one very prominent class of explanations are contrasting explanations, which try to mimic human explanations. However, explanation methods usually assume that the system to be explained is static. Explaining non-static systems is still an open research question, which poses the challenge of how to explain model differences, adaptations, and changes. In this contribution, we propose and empirically evaluate a general framework for explaining model adaptations and differences by contrasting explanations. We also propose a method for automatically finding regions in data space that are affected by a given model adaptation, i.e. regions where the internal reasoning of the adapted model changed and which thus should be explained. Finally, we propose a regularization for model adaptations to ensure that the internal reasoning of the adapted model does not change in an unwanted way.
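The two ideas sketched in the abstract (locating the region of data space where an adaptation changed the model's decisions, and producing a contrasting explanation there) can be illustrated with a minimal toy example. This is only an illustrative sketch, not the paper's actual method: the two linear models, their weights, and the closest-point projection used as a contrasting explanation are all assumptions made for the example.

```python
import numpy as np

# Two hypothetical linear classifiers standing in for an original and an
# adapted model (weights are illustrative, not taken from the paper).
w_old, b_old = np.array([1.0, -1.0]), 0.0
w_new, b_new = np.array([1.0, -0.5]), 0.2

def predict(w, b, X):
    """Binary decision of a linear classifier."""
    return (X @ w + b > 0).astype(int)

# Sample the input space and flag where the two models disagree, i.e. the
# region affected by the adaptation that a user would want explained.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(1000, 2))
disagree = predict(w_old, b_old, X) != predict(w_new, b_new, X)
affected_region = X[disagree]

# A simple contrasting (counterfactual) explanation for one affected point:
# the L2-minimal change that flips the adapted model's decision, obtained by
# projecting the point onto the adapted model's decision boundary.
x = affected_region[0]
delta = -(x @ w_new + b_new) / (w_new @ w_new) * w_new
x_cf = x + 1.001 * delta  # tiny overshoot so the point actually crosses
assert predict(w_new, b_new, x_cf[None])[0] != predict(w_new, b_new, x[None])[0]
```

The disagreement mask is a crude stand-in for the paper's automatic detection of affected regions; the projection step shows the kind of "what minimal change would alter the decision" statement that contrasting explanations provide.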
Pages: 5273-5297
Number of pages: 25