Contrasting Explanations for Understanding and Regularizing Model Adaptations

Citations: 3
Authors
Artelt, Andre [1 ]
Hinder, Fabian [1 ]
Vaquet, Valerie [1 ]
Feldhans, Robert [1 ]
Hammer, Barbara [1 ]
Affiliations
[1] Bielefeld Univ, Fac Technol, Univ Str 25, D-33615 Bielefeld, Nrw, Germany
Keywords
XAI; Contrasting explanations; Model adaptation; Human-centered AI;
DOI
10.1007/s11063-022-10826-5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many of today's decision-making systems deployed in the real world are not static: they change and adapt over time, a phenomenon known as model adaptation. Because of their wide-reaching influence and potentially serious consequences, the need for transparency and interpretability of AI-based decision-making systems is widely accepted and has therefore been studied extensively; one very prominent class of explanations are contrasting explanations, which try to mimic human explanations. However, explanation methods usually assume that the system to be explained is static. Explaining non-static systems is still an open research question, which poses the challenge of how to explain model differences, adaptations, and changes. In this contribution, we propose and empirically evaluate a general framework for explaining model adaptations and differences by contrasting explanations. We also propose a method for automatically finding regions in data space that are affected by a given model adaptation, i.e. regions where the internal reasoning of the adapted model changed and which thus should be explained. Finally, we propose a regularization for model adaptations to ensure that the internal reasoning of the adapted model does not change in an unwanted way.
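The two ideas sketched in the abstract (locating the region of data space where an adaptation changed the model's decisions, and producing a contrasting explanation there) can be illustrated with a minimal toy example. This is only an illustrative sketch, not the paper's actual method: the two linear models, their weights, and the closest-point projection used as a contrasting explanation are all assumptions made for the example.

```python
import numpy as np

# Two hypothetical linear classifiers standing in for an original and an
# adapted model (weights are illustrative, not taken from the paper).
w_old, b_old = np.array([1.0, -1.0]), 0.0
w_new, b_new = np.array([1.0, -0.5]), 0.2

def predict(w, b, X):
    """Binary decision of a linear classifier."""
    return (X @ w + b > 0).astype(int)

# Sample the input space and flag where the two models disagree, i.e. the
# region affected by the adaptation that a user would want explained.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(1000, 2))
disagree = predict(w_old, b_old, X) != predict(w_new, b_new, X)
affected_region = X[disagree]

# A simple contrasting (counterfactual) explanation for one affected point:
# the L2-minimal change that flips the adapted model's decision, obtained by
# projecting the point onto the adapted model's decision boundary.
x = affected_region[0]
delta = -(x @ w_new + b_new) / (w_new @ w_new) * w_new
x_cf = x + 1.001 * delta  # tiny overshoot so the point actually crosses
assert predict(w_new, b_new, x_cf[None])[0] != predict(w_new, b_new, x[None])[0]
```

The disagreement mask is a crude stand-in for the paper's automatic detection of affected regions; the projection step shows the kind of "what minimal change would alter the decision" statement that contrasting explanations provide.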
Pages: 5273-5297
Number of pages: 25