Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Cited by: 12
Authors:
Liu, Deyin [1 ]
Wu, Lin Yuanbo [2 ]
Li, Bo [3 ]
Boussaid, Farid [4 ]
Bennamoun, Mohammed [4 ]
Xie, Xianghua [2 ]
Liang, Chengwu [5 ]
Affiliations:
[1] Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Anhui, Peoples R China
[2] Swansea Univ, Dept Comp Sci, Swansea SA1 8EN, Wales
[3] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China
[4] Univ Western Australia, Perth, WA 6009, Australia
[5] Henan Univ Urban Construct, Pingdingshan 467036, Henan, Peoples R China
Keywords:
Selective input gradient regularization; Jacobian normalization; Adversarial robustness;
DOI:
10.1016/j.patcog.2023.109902
CLC Classification:
TP18 [Artificial Intelligence Theory];
Discipline Codes:
081104; 0812; 0835; 1405
Abstract:
Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.
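The abstract describes a loss that combines standard classification error with a Jacobian-norm penalty and a penalty on input gradients (which underlie perturbation-based saliency maps). The following is a minimal numpy sketch of that general idea for a linear-softmax classifier, where both terms have closed forms; it is an illustration of the regularization structure only, not the paper's actual J-SIGR implementation, and the weighting coefficients `lam_grad` and `lam_jac` are hypothetical names.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def regularized_loss(W, b, x, y_onehot, lam_grad=0.1, lam_jac=0.01):
    """Cross-entropy plus illustrative gradient/Jacobian penalties.

    For a linear model (logits = W x + b) with softmax output p:
      - the input gradient of the cross-entropy is W.T @ (p - y), whose
        squared norm is the input-gradient-regularization term;
      - the Jacobian of the logits w.r.t. x is W itself, so its
        Frobenius norm serves as the Jacobian-norm term.
    """
    p = softmax(W @ x + b)
    ce = -np.log(p[y_onehot.argmax()] + 1e-12)       # cross-entropy loss
    g = W.T @ (p - y_onehot)                          # analytic input gradient
    grad_pen = lam_grad * np.sum(g ** 2)              # input gradient penalty
    jac_pen = lam_jac * np.sum(W ** 2)                # Jacobian Frobenius norm penalty
    return ce + grad_pen + jac_pen
```

In a deep network both penalty terms would be computed by automatic differentiation rather than in closed form, and the paper additionally makes the input-gradient term *selective* via saliency maps; this sketch only shows how the penalties enter the training objective.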
Pages: 11
Related Papers (50 in total):
  • [1] Scaleable input gradient regularization for adversarial robustness
    Finlay, Chris
    Oberman, Adam M.
    MACHINE LEARNING WITH APPLICATIONS, 2021, 3
  • [2] Bridging Optimal Transport and Jacobian Regularization by Optimal Trajectory for Enhanced Adversarial Defense
    Le, Binh M.
    Tariq, Shahroz
    Woo, Simon S.
    arXiv, 2023,
  • [3] Jacobian Regularization for Mitigating Universal Adversarial Perturbations
    Co, Kenneth T.
    Rego, David Martinez
    Lupu, Emil C.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV, 2021, 12894 : 202 - 213
  • [4] Improving DNN Robustness to Adversarial Attacks Using Jacobian Regularization
    Jakubovitz, Daniel
    Giryes, Raja
    COMPUTER VISION - ECCV 2018, PT XII, 2018, 11216 : 525 - 541
  • [5] Revisiting Gradient Regularization: Inject Robust Saliency-Aware Weight Bias for Adversarial Defense
    Li, Qian
    Hu, Qingyuan
    Lin, Chenhao
    Wu, Di
    Shen, Chao
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 5936 - 5949
  • [6] ADVERSARIAL DEFENSE VIA LOCAL FLATNESS REGULARIZATION
    Xu, Jia
    Li, Yiming
    Jiang, Yong
    Xia, Shu-Tao
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 2196 - 2200
  • [7] Interpretable Adversarial Perturbation in Input Embedding Space for Text
    Sato, Motoki
    Suzuki, Jun
    Shindo, Hiroyuki
    Matsumoto, Yuji
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 4323 - 4330
  • [8] A Unified Gradient Regularization Family for Adversarial Examples
    Lyu, Chunchuan
    Huang, Kaizhu
    Liang, Hai-Ning
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2015, : 301 - 309
  • [9] Adversarial Defense using Memristors and Input Preprocessing
    Paudel, Bijay Raj
    Tragoudas, Spyros
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [10] Unifying Adversarial Training Algorithms with Data Gradient Regularization
    Ororbia, Alexander G., II
    Kifer, Daniel
    Giles, C. Lee
    NEURAL COMPUTATION, 2017, 29 (04) : 867 - 887