Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance

Cited by: 0
Authors
Crabbe, Jonathan [1]
van der Schaar, Mihaela [1]
Affiliations
[1] Univ Cambridge, DAMTP, Cambridge, England
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
BLACK-BOX;
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Interpretability methods are valuable only if their explanations faithfully describe the explained model. In this work, we consider neural networks whose predictions are invariant under a specific symmetry group. This includes popular architectures, ranging from convolutional to graph neural networks. Any explanation that faithfully explains this type of model needs to be in agreement with this invariance property. We formalize this intuition through the notions of explanation invariance and equivariance by leveraging the formalism from geometric deep learning. Through this rigorous formalism, we derive (1) two metrics to measure the robustness of any interpretability method with respect to the model symmetry group; (2) theoretical robustness guarantees for some popular interpretability methods; and (3) a systematic approach to increase the invariance of any interpretability method with respect to a symmetry group. By empirically measuring our metrics for explanations of models associated with various modalities and symmetry groups, we derive a set of five guidelines to help users and developers of interpretability methods produce robust explanations.
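To make the two metrics concrete, the following is a minimal sketch (not the paper's implementation) of what explanation invariance and equivariance scores can look like for a feature-attribution explanation under a symmetry group. It assumes a toy shift-invariant model, a finite-difference saliency explainer, and cyclic shifts as the symmetry group; the function names `invariance_score` and `equivariance_score` and the cosine-similarity scoring are illustrative choices, not the authors' definitions.

```python
import numpy as np

def model(x):
    # Toy shift-invariant model: the output depends only on the sorted
    # values of x, so every cyclic shift of x yields the same prediction.
    return float(np.sum(np.sort(x) * np.arange(len(x))))

def saliency(f, x, eps=1e-4):
    # Finite-difference gradient as a simple feature-attribution explanation.
    grad = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        grad[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return grad

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def invariance_score(explain, f, x, group_size):
    # Explanation invariance: agreement between e(g.x) and e(x),
    # averaged over the symmetry group (here: cyclic shifts g).
    e_x = explain(f, x)
    return float(np.mean([cosine(explain(f, np.roll(x, g)), e_x)
                          for g in range(group_size)]))

def equivariance_score(explain, f, x, group_size):
    # Explanation equivariance: agreement between e(g.x) and g.e(x),
    # i.e. the explanation transforms along with the input.
    e_x = explain(f, x)
    return float(np.mean([cosine(explain(f, np.roll(x, g)), np.roll(e_x, g))
                          for g in range(group_size)]))

rng = np.random.default_rng(0)
x = rng.normal(size=8)
print(equivariance_score(saliency, model, x, len(x)))  # ~1.0 here
print(invariance_score(saliency, model, x, len(x)))    # strictly lower
```

For this shift-invariant model, saliency is shift-equivariant (the gradient permutes along with the input), so the equivariance score is near 1 while the invariance score is lower, mirroring the paper's point that for attribution-style explanations of invariant models, equivariance rather than invariance is the appropriate robustness notion.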
Pages: 37