Saliency Map-Based Local White-Box Adversarial Attack Against Deep Neural Networks

Cited by: 1
Authors
Liu, Haohan [1 ,2 ]
Zuo, Xingquan [1 ,2 ]
Huang, Hai [1 ]
Wan, Xing [1 ,2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing, Peoples R China
[2] Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing, Peoples R China
Source
ARTIFICIAL INTELLIGENCE, CICAI 2022, PT II | 2022 / Vol. 13605
Keywords
Deep learning; Saliency map; Local white-box attack; Adversarial attack
DOI
10.1007/978-3-031-20500-2_1
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current deep neural networks (DNNs) are easily fooled by adversarial examples, which are generated by adding small, well-designed, human-imperceptible perturbations to clean examples. Adversarial examples mislead deep learning (DL) models into making wrong predictions. Most existing white-box attack methods in the image domain are based on the model's global gradient: the gradient is first calculated over the whole image, and the perturbation is then added along the gradient direction. These methods usually achieve a high attack success rate, but they also have shortcomings, such as excessive perturbation that is easily detected by the human eye. Therefore, in this paper we propose a Saliency Map-based Local white-box Adversarial Attack method (SMLAA). SMLAA introduces the saliency map used in the interpretability of artificial intelligence. First, Gradient-weighted Class Activation Mapping (Grad-CAM) is utilized to provide a visual interpretation of model decisions and locate important areas in an image. Then, the perturbation is added only to those important local areas to reduce its magnitude. Experimental results show that, compared with global attack methods, SMLAA reduces the average robustness measure by 9%-24% while maintaining the attack success rate. This means SMLAA achieves a high attack success rate with fewer pixels changed.
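To make the idea in the abstract concrete, the following is a minimal PyTorch sketch of the general approach it describes: compute a Grad-CAM saliency map, threshold it into a binary mask of the most important pixels, and apply an FGSM-style gradient-sign perturbation only inside that mask. The chosen layer (layer4 of a ResNet-18), the keep_ratio threshold, and the eps budget are illustrative assumptions for demonstration, not the exact settings of SMLAA.

```python
import torch
import torch.nn.functional as F
from torchvision import models


def grad_cam_mask(model, layer, x, target_class, keep_ratio=0.3):
    """Binary mask (1 = salient pixel) from a Grad-CAM heatmap for `target_class`."""
    feats, grads = [], []
    fh = layer.register_forward_hook(lambda m, i, o: feats.append(o))
    bh = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    logits = model(x)
    model.zero_grad()
    logits[0, target_class].backward()
    fh.remove()
    bh.remove()
    # Grad-CAM: weight each feature map by its channel-averaged gradient, ReLU, upsample.
    weights = grads[0].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    # Keep only the top `keep_ratio` fraction of pixels as the local attack region.
    thresh = torch.quantile(cam.flatten(), 1.0 - keep_ratio)
    return (cam >= thresh).float()


def local_fgsm(model, layer, x, y, eps=8 / 255, keep_ratio=0.3):
    """FGSM-style perturbation restricted to the Grad-CAM salient region."""
    mask = grad_cam_mask(model, layer, x.detach(), y.item(), keep_ratio)
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    model.zero_grad()
    loss.backward()
    # Perturb only the masked (salient) pixels along the loss-gradient sign.
    x_adv = x_adv + eps * x_adv.grad.sign() * mask
    return x_adv.clamp(0, 1).detach()


if __name__ == "__main__":
    model = models.resnet18(weights=None).eval()   # untrained net, just to show the flow
    x = torch.rand(1, 3, 224, 224)                 # placeholder "clean" image in [0, 1]
    y = torch.tensor([7])                          # placeholder label
    x_adv = local_fgsm(model, model.layer4, x, y)
    print("max perturbation:", (x_adv - x).abs().max().item())
```

Restricting the perturbation to the thresholded Grad-CAM region is what keeps the overall distortion small relative to a global attack, while the gradient sign still pushes the salient pixels toward misclassification.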
Pages: 3-14
Number of pages: 12