Explainable image classification with evidence counterfactual

Cited by: 50
Authors
Vermeire, Tom [1 ]
Brughmans, Dieter [1 ]
Goethals, Sofie [1 ]
de Oliveira, Raphael Mazzine Barbosa [1 ]
Martens, David [1 ]
Affiliations
[1] Univ Antwerp, Prinsstr 13, B-2000 Antwerp, Belgium
Keywords
Image classification; Counterfactual explanation; Explainable artificial intelligence; Search algorithms
DOI
10.1007/s10044-021-01055-y
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The complexity of state-of-the-art modeling techniques for image classification impedes the ability to explain model predictions in an interpretable way. A counterfactual explanation highlights the parts of an image which, when removed, would change the predicted class. Both legal scholars and data scientists are increasingly turning to counterfactual explanations, as these provide a high degree of human interpretability, reveal what minimal information needs to change in order to arrive at a different prediction, and do not require the prediction model to be disclosed. Our literature review shows that existing counterfactual methods for image classification place strong, and often unrealistic, requirements on access to the training data and the model internals. Therefore, SEDC (Search for Evidence Counterfactual) is introduced as a model-agnostic, instance-level explanation method for image classification that does not need access to the training data. As image classification tasks are typically multiclass problems, an additional contribution is the introduction of SEDC-T, a variant that allows specifying a target counterfactual class. These methods are experimentally tested on ImageNet data, and with concrete examples we illustrate how the resulting explanations can give insights into model decisions. Moreover, SEDC is benchmarked against existing model-agnostic explanation methods, demonstrating stability of results, computational efficiency, and the counterfactual nature of the explanations.
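To make the search procedure concrete, the following is a minimal, hypothetical Python sketch of a greedy segment-removal counterfactual search in the spirit of SEDC and SEDC-T; it is not the authors' implementation. The function name sedc_explain, the black-box predict_proba interface, and the choice to "remove" a segment by zeroing its pixels are illustrative assumptions (other replacement values, such as the mean color, blurring, or inpainting, are equally possible), and the exhaustive greedy loop stands in for the paper's more refined search strategy.

```python
import numpy as np
from skimage.segmentation import slic


def sedc_explain(image, predict_proba, target_class=None, n_segments=50):
    """Greedy evidence-counterfactual search (illustrative sketch).

    Segments `image` into superpixels and greedily removes the segment
    whose removal most lowers the score of the originally predicted
    class, until the prediction flips (SEDC-style) or reaches
    `target_class` (SEDC-T-style). `predict_proba` is any black-box
    function mapping an image to a 1-D array of class probabilities,
    which is what makes the search model-agnostic.
    """
    segments = slic(image, n_segments=n_segments, start_label=0)
    labels = range(segments.max() + 1)
    original_class = int(np.argmax(predict_proba(image)))
    removed = set()

    while len(removed) < segments.max() + 1:
        best_seg, best_probs = None, None
        for seg in labels:
            if seg in removed:
                continue
            candidate = image.copy()
            # "Remove" segments by zeroing their pixels; mean color,
            # blurring, or inpainting are alternative replacements.
            candidate[np.isin(segments, list(removed | {seg}))] = 0
            probs = predict_proba(candidate)
            if best_probs is None or probs[original_class] < best_probs[original_class]:
                best_seg, best_probs = seg, probs
        removed.add(best_seg)

        new_class = int(np.argmax(best_probs))
        flipped = (target_class is None and new_class != original_class) or \
                  (target_class is not None and new_class == target_class)
        if flipped:
            # The removed segments form the evidence counterfactual.
            return removed, new_class
    return None, original_class  # no counterfactual found
```

In this sketch, the removed superpixels constitute the explanation: a small set of image parts, found greedily, whose removal changes the model's decision (or, when target_class is given, steers it to the requested class).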
Pages: 315-335
Page count: 21