Interpreting Adversarial Examples in Deep Learning: A Review

Cited by: 29
Authors
Han, Sicong [1 ]
Lin, Chenhao [1 ]
Shen, Chao [1 ]
Wang, Qian [2 ]
Guan, Xiaohong [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, 28 Xianning West Rd, Xian 710049, Shaanxi, Peoples R China
[2] Wuhan Univ, 299 Bayi Rd, Wuhan 430072, Hubei, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Deep learning; adversarial example; interpretability; adversarial robustness;
DOI
10.1145/3594869
Chinese Library Classification
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
Deep learning technology is increasingly applied in safety-critical scenarios, yet it has recently been shown to be susceptible to imperceptible adversarial perturbations. This raises serious concerns about the adversarial robustness of deep neural network (DNN)-based applications. Accordingly, various adversarial attack and defense approaches have been proposed. However, current studies implement different types of attacks and defenses under particular assumptions, and a full theoretical understanding and interpretation of adversarial examples is still lacking. Rather than reviewing technical progress in adversarial attacks and defenses, this article presents a framework of three perspectives for comprehensively discussing recent work that seeks to explain adversarial examples theoretically. Within each perspective, the various hypotheses are further categorized into subcategories and introduced systematically. To the best of our knowledge, this study is the first to survey existing research on adversarial examples and adversarial robustness from the interpretability perspective. Drawing on the reviewed literature, this survey characterizes current problems and challenges that need to be addressed and highlights potential directions for future research on adversarial examples.
Pages: 38