Generalizing Adversarial Explanations with Grad-CAM

Cited by: 8
Authors
Chakraborty, Tanmay [1]
Trehan, Utkarsh [1]
Mallat, Khawla [1,2]
Dugelay, Jean-Luc [1]
Affiliations
[1] EURECOM, Campus SophiaTech, 450 Route des Chappes, F-06410 Biot, France
[2] SAP Security Research Labs France, Biot, France
Source
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022), 2022
DOI
10.1109/CVPRW56347.2022.00031
CLC Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Gradient-weighted Class Activation Mapping (Grad-CAM) is an example-based explanation method that provides a gradient activation heat map as an explanation for Convolutional Neural Network (CNN) models. The drawback of this method is that it cannot be used to generalize CNN behaviour. In this paper, we present a novel method that extends Grad-CAM from example-based explanations to a method for explaining global model behaviour. This is achieved by introducing two new metrics for model generalization: (i) Mean Observed Dissimilarity (MOD) and (ii) Variation in Dissimilarity (VID). These metrics are computed from the Normalized Inverted Structural Similarity Index (NISSIM) between the Grad-CAM heatmaps of samples from the original test set and their counterparts from the adversarial test set. For our experiments, we attack deep models such as VGG16, ResNet50, and ResNet101, and wide models such as InceptionNetv3 and XceptionNet, using the Fast Gradient Sign Method (FGSM). We then compute MOD and VID for the automatic face recognition (AFR) use case on the VGGFace2 dataset. Across all models under adversarial attack, we observe a consistent shift in the region highlighted in the Grad-CAM heatmap, reflecting that region's contribution to the decision. The proposed method can be used to understand adversarial attacks and to explain the behaviour of black-box CNN models for image analysis.
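The abstract does not give closed-form definitions of NISSIM, MOD, or VID, so the following Python sketch is only one plausible reading of the pipeline, not the authors' implementation: NISSIM is taken here as SSIM inverted and rescaled to [0, 1], MOD as the mean of NISSIM over paired original/adversarial heatmaps, and VID as its variance. All function names, the normalization, and the choice of variance for "variation" are assumptions.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def nissim(hm_orig, hm_adv):
    """Assumed NISSIM: invert SSIM and rescale to [0, 1].

    Heatmaps are assumed to be 2D float arrays scaled to [0, 1].
    SSIM lies in [-1, 1], so after this mapping 0 means identical
    heatmaps and 1 means maximally dissimilar ones.
    """
    s = ssim(hm_orig, hm_adv, data_range=1.0)
    return (1.0 - s) / 2.0

def mod_vid(orig_heatmaps, adv_heatmaps):
    """Assumed MOD/VID: mean and variance of per-sample NISSIM
    over paired original and adversarial Grad-CAM heatmaps."""
    d = np.array([nissim(o, a) for o, a in zip(orig_heatmaps, adv_heatmaps)])
    return d.mean(), d.var()
```

Under this reading, a larger MOD would indicate a larger average shift of the highlighted region under attack, while a small VID would indicate that the shift is consistent across samples, matching the consistency the abstract reports across models.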
Pages: 186-192 (7 pages)