Adversarial example detection based on saliency map features

Cited by: 11
Authors
Wang, Shen [1]
Gong, Yuxin [1]
Affiliations
[1] Harbin Inst Technol, Harbin, Peoples R China
Keywords
Machine learning; Adversarial example detection; Interpretability; Saliency map;
DOI
10.1007/s10489-021-02759-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, machine learning has greatly improved image recognition capability. However, studies have shown that neural network models are vulnerable to adversarial examples, which cause models to output wrong answers with high confidence. To understand these vulnerabilities, we use interpretability methods to reveal the internal decision-making behavior of models. The interpretation results show that the nonnormalized saliency maps of clean and adversarial examples become increasingly differentiated as they evolve along the model's hidden layers. Taking advantage of this phenomenon, we propose an adversarial example detection method based on multilayer saliency features, which can comprehensively capture the abnormal characteristics of adversarial example interpretations. Experimental results show that the proposed method can effectively detect adversarial examples generated by gradient-based, optimization-based, and black-box attacks, and that it is comparable with state-of-the-art methods.
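The abstract does not say how the multilayer saliency features are computed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes a small ReLU multilayer perceptron with random weights, uses gradient times activation as the unnormalized saliency at each layer, and summarizes each layer with three statistics (mean magnitude, maximum magnitude, standard deviation). The function name `layer_saliency_features`, the network shape, and the choice of statistics are all hypothetical.

```python
import numpy as np

def layer_saliency_features(x, weights):
    """Summarize unnormalized saliency maps at every layer of a ReLU MLP.

    weights: list of (W, b) pairs; the last pair maps to the logits.
    Saliency here is gradient * activation (an assumption, not the paper's definition).
    """
    # Forward pass, storing the input and each hidden activation.
    acts = [x]
    for W, b in weights[:-1]:
        acts.append(np.maximum(0.0, acts[-1] @ W + b))
    W_out, b_out = weights[-1]
    logits = acts[-1] @ W_out + b_out
    top = int(np.argmax(logits))

    # Backward pass: gradient of the top logit w.r.t. each stored activation.
    grad = W_out[:, top]
    feats = []
    for i in range(len(acts) - 1, -1, -1):
        sal = grad * acts[i]  # deliberately left unnormalized
        feats = [np.abs(sal).mean(), np.abs(sal).max(), sal.std()] + feats
        if i > 0:
            W, _ = weights[i - 1]
            # Backpropagate through the ReLU that produced acts[i].
            grad = (grad * (acts[i] > 0)) @ W.T
    return np.array(feats)

# Toy usage: an 8 -> 6 -> 5 -> 3 network with random (untrained) weights.
rng = np.random.default_rng(0)
sizes = [8, 6, 5, 3]
weights = [(rng.normal(size=(m, n)), rng.normal(size=n))
           for m, n in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=8)
features = layer_saliency_features(x, weights)
```

In a detector along these lines, feature vectors like `features` would be computed for both clean and adversarial inputs and passed to any binary classifier; which classifier the paper uses is not stated in the abstract.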
Pages: 6262 - 6275
Page count: 14
Related papers
50 items in total
  • [41] Object Detection Method Based on Saliency Map Fusion for UAV-borne Thermal Images
    Zhao X.-K.
    Li M.
    Zhang G.
    Li N.
    Li J.-S.
    [J]. Zidonghua Xuebao/Acta Automatica Sinica, 2021, 47 (09) : 2120 - 2131
  • [42] A Saliency Detection Model Using Low-Level Features Based on Wavelet Transform
    Imamoglu, Nevrez
    Lin, Weisi
    Fang, Yuming
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2013, 15 (01) : 96 - 105
  • [43] Adversarial Example Detection with Latent Representation Dynamic Prototype
    Wang, Taowen
    Qian, Zhuang
    Yang, Xi
    [J]. NEURAL INFORMATION PROCESSING, ICONIP 2023, PT IV, 2024, 14450 : 525 - 536
  • [44] Adversarial example denoising and detection based on the consistency between Fourier-transformed layers
    Jung, Seunghwan
    Kim, Heeyeon
    Chung, Minyoung
    Shin, Yeong-Gil
    [J]. NEUROCOMPUTING, 2024, 606
  • [45] An interpretability security framework for intelligent decision support systems based on saliency map
    Zhang, Denghui
    Gu, Zhaoquan
    Ren, Lijing
    Shafiq, Muhammad
    [J]. INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2023, 22 (05) : 1249 - 1260
  • [46] Multi-Modal Adversarial Example Detection with Transformer
    Ding, Chaoyue
    Sun, Shiliang
    Zhao, Jing
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [47] Robust Adversarial Example Detection Algorithm Based on High-Level Feature Differences
    Mu, Hua
    Li, Chenggang
    Peng, Anjie
    Wang, Yangyang
    Liang, Zhenyu
    [J]. Sensors, 2025, 25 (06)
  • [49] Objective Image Quality Assessment Based on Saliency Map
    Wei, Longsheng
    Liu, Wei
    Wang, Xinmei
    Liu, Feng
    Luo, Dapeng
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2016, 20 (02) : 205 - 211
  • [50] Saliency Map Generation Based on Human Level Performance
    Amini, Ehsan
    Javadi, Saleh
    Khatibi, Siamak
    [J]. 2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024, 2024, : 607 - 611