Adversarial attacks and defenses using feature-space stochasticity

Cited by: 4
Authors
Ukita, Jumpei [1]
Ohki, Kenichi [1,2,3]
Affiliations
[1] Univ Tokyo, Sch Med, Dept Physiol, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1130033, Japan
[2] Int Res Ctr Neurointelligence WPI IRCN, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1130033, Japan
[3] Inst AI & Beyond, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1130033, Japan
Keywords
Adversarial attack; Adversarial defense; Feature smoothing; Deep neural networks
DOI
10.1016/j.neunet.2023.08.022
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent studies in deep neural networks have shown that injecting random noise in the input layer of a network helps defend against ℓp-norm-bounded adversarial perturbations. However, to defend against unrestricted adversarial examples, most of which are not ℓp-norm-bounded in the input layer, such input-layer random noise may not be sufficient. In the first part of this study, we generated a novel class of unrestricted adversarial examples termed feature-space adversarial examples. These examples are far from the original data in the input space but adjacent to the original data in a hidden-layer feature space, and far again in the output layer. In the second part of this study, we empirically showed that while injecting random noise in the input layer failed to defend against these feature-space adversarial examples, injecting random noise in the hidden layer did defend against them. These results highlight a novel benefit of stochasticity in higher layers: it is useful for defending against feature-space adversarial examples, a class of unrestricted adversarial examples. (c) 2023 Elsevier Ltd. All rights reserved.
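
The two ideas summarized in the abstract can be illustrated with a short, self-contained PyTorch sketch. This is not the authors' implementation: the toy network (ToyNet), the choice of hidden layer (features), the noise level (noise_std), the optimizer settings, and the random tensors standing in for images are all illustrative assumptions. The sketch shows (i) crafting a feature-space adversarial example by optimizing an unrelated starting image so its hidden-layer features approach those of an original image, and (ii) a hidden-layer noise-injection defense that averages predictions over several noisy forward passes.

```python
# A minimal sketch (PyTorch), not the authors' code. ToyNet, the chosen hidden
# layer, the noise level, and all hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

class ToyNet(nn.Module):
    """Small CNN classifier; `features` is the hidden layer used below."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )

    def forward(self, x: torch.Tensor, noise_std: float = 0.0) -> torch.Tensor:
        h = self.features(x)
        if noise_std > 0:                      # hidden-layer stochasticity (defense)
            h = h + noise_std * torch.randn_like(h)
        return self.head(h)

model = ToyNet().eval()
x_orig = torch.rand(1, 3, 32, 32)              # stand-in for an original image

# (i) Feature-space adversarial example: start from an unrelated image (far from
# x_orig in input space) and optimize it so its hidden-layer features approach
# those of x_orig; the output layer may still map it to a different class.
with torch.no_grad():
    feat_orig = model.features(x_orig)

x_adv = torch.rand(1, 3, 32, 32, requires_grad=True)
opt = torch.optim.Adam([x_adv], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(model.features(x_adv), feat_orig)   # match hidden features
    loss.backward()
    opt.step()
    x_adv.data.clamp_(0.0, 1.0)                            # keep a valid image

# (ii) Hidden-layer noise injection at inference: average logits over several
# noisy forward passes instead of using a single deterministic pass.
with torch.no_grad():
    pred_plain = model(x_adv).argmax(dim=1)
    noisy_logits = torch.stack([model(x_adv, noise_std=0.5) for _ in range(20)])
    pred_smooth = noisy_logits.mean(dim=0).argmax(dim=1)

print("plain prediction:", pred_plain.item(), "| smoothed prediction:", pred_smooth.item())
```

The paper's actual attack objective, layer selection, and noise magnitude are chosen more carefully; this sketch only mirrors the overall structure described in the abstract.
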
Pages: 875-889
Number of pages: 15