Boosting adversarial attacks with transformed gradient

Cited by: 16
Authors
He, Zhengyun [1 ,2 ]
Duan, Yexin [1 ,3 ]
Zhang, Wu [1 ]
Zou, Junhua [1 ]
He, Zhengfang [5 ]
Wang, Yunyun [4 ]
Pan, Zhisong [1 ]
Affiliations
[1] Army Engn Univ PLA, Command & Control Engn Coll, Nanjing, Peoples R China
[2] Hunan Univ Technol, Railway Transportat Coll, Zhuzhou, Peoples R China
[3] Army Mil Transportat Univ PLA, Zhenjiang Campus, Zhenjiang, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
[5] Zhejiang Uniview Technol Co Ltd, Hangzhou, Peoples R China
Keywords
Deep neural networks; Image classification; Adversarial examples; Adversarial machine learning; Adversarial attack; Transferability
DOI
10.1016/j.cose.2022.102720
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to benign examples. Increasing the attack success rate usually requires a larger noise magnitude, which makes the perturbation noticeable. To this end, we propose a Transformed Gradient (TG) method, which achieves a higher attack success rate with smaller perturbations against the target model, i.e., an ensemble of black-box defense models. It consists of three steps: original gradient accumulation, gradient amplification, and gradient truncation. In addition, we introduce the Fréchet Inception Distance (FID) and Learned Perceptual Image Patch Similarity (LPIPS) to evaluate fidelity and perceptual distance from the original example, respectively, which is more comprehensive than using the L-infinity norm alone as the evaluation metric. Furthermore, we propose optimizing the coefficients of the source-model ensemble to further improve adversarial attacks. Extensive experimental results demonstrate that the perturbations of adversarial examples generated by our method are smaller than those of the state-of-the-art baselines, namely MI, DI, TI, and RF-DE built on vanilla iterative FGSM, as well as their combinations. Compared with the baseline method, the average black-box attack success rate and the total score are improved by 6.6% and 13.8, respectively. Our code is publicly available on GitHub at https://github.com/Hezhengyun/Transformed-Gradient. (c) 2022 Elsevier Ltd. All rights reserved.
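The abstract above names the three gradient-transformation steps but not their exact update rules. The following is a minimal PyTorch sketch of how such a transformed-gradient update could sit inside an iterative FGSM-style attack; the amplification factor beta, the truncation threshold tau, the mean-normalization of the gradient, and the function name tg_attack are illustrative assumptions rather than the authors' definitions (see the GitHub repository above for the reference implementation).

import torch
import torch.nn.functional as F

def tg_attack(model, x, y, eps=16 / 255, steps=10, beta=2.5, tau=1e-3):
    """Craft L-infinity-bounded adversarial examples for a batch of images
    x (shape (N, C, H, W), values in [0, 1]) with labels y, using a
    three-step transformed gradient: accumulate, amplify, truncate.
    beta and tau are assumed hyperparameters, not the paper's values."""
    alpha = eps / steps                       # per-iteration step size
    x_adv = x.clone().detach()
    acc = torch.zeros_like(x)                 # step 1: accumulated gradient

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Step 1: accumulate the mean-normalized original gradient.
        acc = acc + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        # Step 2: amplify the accumulated gradient.
        amplified = beta * acc
        # Step 3: truncate small components so the perturbation stays sparse.
        transformed = torch.where(amplified.abs() > tau,
                                  amplified, torch.zeros_like(amplified))

        # FGSM-style update, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * transformed.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)

    return x_adv.detach()

In use, a pretrained source model (or a weighted ensemble of source models, whose coefficients the paper proposes to optimize) would be put in eval mode, x_adv = tg_attack(model, x, y) would be computed, and the resulting examples transferred to the black-box defense models; fidelity to the benign images would then be scored with FID and LPIPS in addition to the L-infinity bound, as the abstract describes.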
Pages: 12