Benchmarking the influence of pre-training on explanation performance in MR image classification

Cited by: 4

Authors:

Oliveira, Marta ^{[1]}

Wilming, Rick ^{[2]}

Clark, Benedict ^{[1]}

Budding, Celine ^{[3]}

Eitel, Fabian ^{[3]}

Ritter, Kerstin ^{[3]}

Haufe, Stefan ^{[1,2,3]}

Affiliations:

[1] Phys Tech Bundesanstalt, Div 8 44, Berlin, Germany

[2] Tech Univ Berlin, Dept Comp Sci, Berlin, Germany

[3] Charite Univ Med Berlin, Berlin Ctr Adv Neuroimaging BCAN, Berlin, Germany

Source:

FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2024 / Vol. 7

Funding:

European Research Council;

Keywords:

XAI; explainability; interpretability; pre-training; MRI; benchmark; dataset; classification; MODELS;

DOI:

10.3389/frai.2024.1330919

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional Neural Networks (CNNs) are frequently and successfully used in medical prediction tasks. They are often used in combination with transfer learning, leading to improved performance when training data for the task are scarce. The resulting models are highly complex and typically do not provide any insight into their predictive mechanisms, motivating the field of "explainable" artificial intelligence (XAI). However, previous studies have rarely quantitatively evaluated the "explanation performance" of XAI methods against ground-truth data, and the influence of transfer learning on objective measures of explanation performance has not been investigated. Here, we propose a benchmark dataset that allows for quantifying explanation performance in a realistic magnetic resonance imaging (MRI) classification task. We employ this benchmark to understand the influence of transfer learning on the quality of explanations. Experimental results show that popular XAI methods applied to the same underlying model differ vastly in performance, even when considering only correctly classified examples. We further observe that explanation performance strongly depends on the task used for pre-training and the number of CNN layers pre-trained. These results hold after correcting for a substantial correlation between explanation and classification performance.

References

Pages: 10

45 entries in total

[1] Agarwal C, 2022, Arxiv, DOI arXiv:2206.11104

[2] Transfer Learning Approaches for Neuroimaging Analysis: A Scoping Review [J].

Ardalan, Zaniar ;

Subbian, Vignesh .

FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2022, 5

[3] CLEVR-XAI: A benchmark dataset for the ground truth evaluation of neural network explanations [J].

Arras, Leila ;

Osman, Ahmed ;

Samek, Wojciech .

INFORMATION FUSION, 2022, 81 :14-40

[4] On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation [J].

Bach, Sebastian ;

Binder, Alexander ;

Montavon, Gregoire ;

Klauschen, Frederick ;

Mueller, Klaus-Robert ;

Samek, Wojciech .

PLOS ONE, 2015, 10 (07)

[5] Transfer Learning with Convolutional Neural Networks for Classification of Abdominal Ultrasound Images [J].

Cheng, Phillip M. ;

Malhi, Harshawn S. .

JOURNAL OF DIGITAL IMAGING, 2017, 30 (02) :234-243

[6] Cherti M., 2021, Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images, P1

[7] Clark B, 2023, Arxiv, DOI arXiv:2306.12816

[8] REVISED DEFINITION FOR SUPPRESSOR VARIABLES - GUIDE TO THEIR IDENTIFICATION AND INTERPRETATION [J].