Adversarial attacks and defenses using feature-space stochasticity

Cited by: 4
Authors
Ukita, Jumpei [1 ]
Ohki, Kenichi [1 ,2 ,3 ]
Affiliations
[1] Univ Tokyo, Sch Med, Dept Physiol, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
[2] Int Res Ctr Neurointelligence (WPI-IRCN), 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
[3] Inst AI & Beyond, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Keywords
Adversarial attack; Adversarial defense; Feature smoothing; Deep neural networks
DOI
10.1016/j.neunet.2023.08.022
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies of deep neural networks have shown that injecting random noise into the input layer of a network contributes to robustness against ℓp-norm-bounded adversarial perturbations. However, to defend against unrestricted adversarial examples, most of which are not ℓp-norm-bounded in the input layer, such input-layer random noise may not be sufficient. In the first part of this study, we generated a novel class of unrestricted adversarial examples termed feature-space adversarial examples. These examples are far from the original data in the input space but adjacent to the original data in a hidden-layer feature space, and far again in the output layer. In the second part of this study, we empirically showed that while injecting random noise in the input layer failed to defend against these feature-space adversarial examples, injecting random noise in the hidden layer defended against them. These results highlight a novel benefit of stochasticity in higher layers: it is useful for defending against feature-space adversarial examples, a class of unrestricted adversarial examples. (c) 2023 Elsevier Ltd. All rights reserved.
Pages: 875-889
Page count: 15
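
To make the record's technical content concrete: the attack described in the abstract can be sketched as a short optimization loop. The PyTorch code below is a minimal illustration under stated assumptions, not the authors' published procedure; the toy network, the choice of hidden layer, the loss weighting, and the random stand-in images are all hypothetical. It searches for an input that starts far from the original in input space, pulls its hidden-layer features toward the original's, and simultaneously pushes the output layer toward a different class.

```python
# Minimal sketch of a feature-space adversarial example, assuming a toy CNN
# and randomly generated images; the paper's actual models, layer choices,
# and loss weights are not given here. The optimized input starts far from
# the original in input space, is pulled close to the original in a
# hidden-layer feature space, and is pushed toward a different output class.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy classifier standing in for the paper's networks (assumption).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),   # model[3]: "feature space"
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
).eval()

# Forward hook that records the hidden-layer activations on every pass.
features = {}
model[3].register_forward_hook(
    lambda mod, inp, out: features.__setitem__("h", out))

x_orig = torch.rand(1, 3, 32, 32)          # stand-in for an original image
with torch.no_grad():
    logits_orig = model(x_orig)
    h_orig = features["h"].clone()         # original hidden-layer features
    y_orig = logits_orig.argmax(1)

# Initialize far away in input space: a fresh random image, not x_orig + noise.
x_adv = torch.rand(1, 3, 32, 32, requires_grad=True)
y_target = (y_orig + 1) % 10               # any label other than the original
opt = torch.optim.Adam([x_adv], lr=0.05)

for _ in range(300):
    opt.zero_grad()
    logits = model(x_adv)                  # hook refreshes features["h"]
    feat_loss = F.mse_loss(features["h"], h_orig)   # close in feature space
    out_loss = F.cross_entropy(logits, y_target)    # far at the output layer
    (feat_loss + 0.1 * out_loss).backward()
    opt.step()
    x_adv.data.clamp_(0, 1)                # keep a valid image

with torch.no_grad():
    pred = model(x_adv).argmax(1)
    print("input-space L2 distance  :", (x_adv - x_orig).norm().item())
    print("feature-space L2 distance:", (features["h"] - h_orig).norm().item())
    print("predicted class:", pred.item(), "| original class:", y_orig.item())
```

On a trained classifier the 0.1 weight trades off the two objectives; on this untrained toy model the loop only demonstrates the mechanics. Because nothing bounds the input-space distance, the result is an unrestricted adversarial example in the abstract's sense.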
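The defense from the second part of the study, random noise injected into a hidden layer, can be sketched in the same setting. The snippet below is a hedged illustration: a Gaussian-noise module placed at the attacked feature space, with predictions averaged over several noise draws at inference time. The noise scale sigma, its placement, and the sample count are assumptions, not the paper's reported settings.

```python
# Minimal sketch of hidden-layer noise injection ("feature-space
# stochasticity") as a defense, with Monte-Carlo averaging over noise draws.
# sigma, the noise placement, and n_samples are illustrative assumptions.
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds i.i.d. Gaussian noise to activations on every forward pass,
    including at inference time (this is an inference-time defense)."""
    def __init__(self, sigma: float):
        super().__init__()
        self.sigma = sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.sigma * torch.randn_like(x)

def smoothed_predict(model: nn.Module, x: torch.Tensor,
                     n_samples: int = 32) -> torch.Tensor:
    """Average softmax outputs over repeated noisy forward passes."""
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=1)
                             for _ in range(n_samples)])
    return probs.mean(dim=0).argmax(dim=1)

# Same toy architecture as above, with the noise module inserted right at
# the hidden layer the attack targeted (assumption).
noisy_model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    GaussianNoise(sigma=0.5),              # hidden-layer stochasticity
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
).eval()

x = torch.rand(1, 3, 32, 32)
print("smoothed prediction:", smoothed_predict(noisy_model, x).item())
```

Per the abstract, the same noise applied at the input layer did not defend against feature-space adversarial examples, whereas noise applied in the hidden layer did; the sketch places the noise module at the layer the attack pinned down for exactly that reason.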