Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Cited by: 12
Authors
Liu, Deyin [1]
Wu, Lin Yuanbo [2]
Li, Bo [3]
Boussaid, Farid [4]
Bennamoun, Mohammed [4]
Xie, Xianghua [2]
Liang, Chengwu [5]
Affiliations
[1] Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Anhui, Peoples R China
[2] Swansea Univ, Dept Comp Sci, Swansea SA1 8EN, Wales
[3] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China
[4] Univ Western Australia, Perth, WA 6009, Australia
[5] Henan Univ Urban Construct, Pingdingshan 467036, Henan, Peoples R China
Keywords
Selective input gradient regularization; Jacobian normalization; Adversarial robustness
DOI
10.1016/j.patcog.2023.109902
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.
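The two ingredients named in the abstract, a penalty on the Jacobian norm and a penalty on the input gradient, can be illustrated with a minimal sketch. This is not the authors' J-SIGR implementation: it assumes a plain linear softmax classifier so that both quantities have closed forms, it omits the selective (saliency-based) part entirely, and all function names and the weights `lam_grad` and `lam_jac` are hypothetical.

```python
import math


def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits(W, b, x):
    # Linear model (an assumption for this sketch): z = W x + b.
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def cross_entropy(W, b, x, y):
    # Cross-entropy loss for true class index y.
    return -math.log(softmax(logits(W, b, x))[y])


def input_gradient(W, b, x, y):
    # Gradient of the cross-entropy w.r.t. the input x.
    # For a linear softmax model this is W^T (p - onehot(y)).
    p = softmax(logits(W, b, x))
    d = [pi - (1.0 if i == y else 0.0) for i, pi in enumerate(p)]
    return [sum(W[i][j] * d[i] for i in range(len(W))) for j in range(len(x))]


def jacobian_frobenius_norm(W):
    # For a linear model the Jacobian of the logits w.r.t. x is W itself.
    return math.sqrt(sum(w * w for row in W for w in row))


def regularized_loss(W, b, x, y, lam_grad=0.1, lam_jac=0.01):
    # Hypothetical combined objective: task loss plus squared input-gradient
    # norm plus squared Jacobian Frobenius norm.
    g = input_gradient(W, b, x, y)
    grad_penalty = sum(v * v for v in g)
    jac_penalty = jacobian_frobenius_norm(W) ** 2
    return cross_entropy(W, b, x, y) + lam_grad * grad_penalty + lam_jac * jac_penalty
```

For a deep network the same two penalties would be obtained with automatic differentiation rather than the closed forms used here; the linear case is chosen only to keep the sketch dependency-free.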
Pages: 11