Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Cited by: 12

Authors:

Liu, Deyin ^{[1]}

Wu, Lin Yuanbo ^{[2]}

Li, Bo ^{[3]}

Boussaid, Farid ^{[4]}

Bennamoun, Mohammed ^{[4]}

Xie, Xianghua ^{[2]}

Liang, Chengwu ^{[5]}

Affiliations:

[1] Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Anhui, Peoples R China

[2] Swansea Univ, Dept Comp Sci, Swansea SA1 8EN, Wales

[3] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China

[4] Univ Western Australia, Perth, WA 6009, Australia

[5] Henan Univ Urban Construct, Pingdingshan 467036, Henan, Peoples R China

Source:

PATTERN RECOGNITION | 2024 / Vol. 145

Keywords:

Selective input gradient regularization; Jacobian normalization; Adversarial robustness;

DOI:

10.1016/j.patcog.2023.109902

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.
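As a rough illustration of the two ingredients the abstract names, the sketch below adds a Jacobian-norm penalty and a selective input-gradient penalty to a cross-entropy loss. It is not the authors' formulation. The toy linear softmax classifier, the squared Frobenius norm, the "keep the largest-magnitude entries" selection rule, and the names `lam_jac`, `lam_grad`, and `keep_ratio` are all assumptions made for illustration. It also uses a plain gradient-based penalty in place of the perturbation-based saliency maps mentioned in the abstract.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def probs(W, x):
    """Class probabilities of a toy linear classifier, p = softmax(W x)."""
    return softmax([sum(w * xj for w, xj in zip(row, x)) for row in W])

def cross_entropy(W, x, y):
    """Cross-entropy loss for true class index y."""
    return -math.log(probs(W, x)[y])

def input_gradient(W, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input: W^T (p - onehot(y))."""
    p = probs(W, x)
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(W[k][j] * d[k] for k in range(len(W))) for j in range(len(x))]

def jacobian_frobenius_sq(W, x):
    """Squared Frobenius norm of dp/dx, where dp/dx = (diag(p) - p p^T) W."""
    p = probs(W, x)
    total = 0.0
    for i in range(len(p)):
        for j in range(len(x)):
            j_ij = sum((p[i] * (1.0 if i == k else 0.0) - p[i] * p[k]) * W[k][j]
                       for k in range(len(p)))
            total += j_ij ** 2
    return total

def selective_gradient_penalty(grad, keep_ratio):
    """Hypothetical selection rule: penalize only the largest-magnitude entries."""
    k = max(1, math.ceil(keep_ratio * len(grad)))
    largest = sorted((abs(g) for g in grad), reverse=True)[:k]
    return sum(g * g for g in largest)

def regularized_loss(W, x, y, lam_jac=0.1, lam_grad=0.1, keep_ratio=0.5):
    """Cross-entropy plus the two (assumed) regularization terms."""
    grad = input_gradient(W, x, y)
    return (cross_entropy(W, x, y)
            + lam_jac * jacobian_frobenius_sq(W, x)
            + lam_grad * selective_gradient_penalty(grad, keep_ratio))

# Toy usage: a 2-class, 2-feature model with made-up weights.
W = [[1.0, -0.5], [0.2, 0.3]]
x = [0.5, -1.0]
y = 0
print(cross_entropy(W, x, y), regularized_loss(W, x, y))
```

Setting both weights to zero recovers the plain cross-entropy loss. In a real network the same terms would be computed with automatic differentiation rather than the closed forms used here.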


Pages: 11

50 items in total

[31] Ensemble of Predictions from Augmented Input as Adversarial Defense for Face Verification System
Kurnianggoro, Laksono
Jo, Kang-Hyun
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2019, PT II, 2019, 11432 : 658 - 669
[32] A PROJECTED GRADIENT ALGORITHM FOR IMAGE RESTORATION UNDER HESSIAN MATRIX-NORM REGULARIZATION
Lefkimmiatis, Stamatios
Unser, Michael
2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012 : 3029 - 3032
[33] Selective Adversarial Adaptation Learning via Exclusive Regularization for Partial Domain Adaptation
Li, Ping
Shen, Linlin
Ling, Hefei
Wu, Lei
Wang, Qian
Zhao, Chuang
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
[34] Projected Gradient Descent Adversarial Attack and Its Defense on a Fault Diagnosis System
Ayas, Mustafa Sinasi
Ayas, Selen
Djouadi, Seddik M.
2022 45TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING, TSP, 2022 : 36 - 39
[35] COMBATING FALSE SENSE OF SECURITY: BREAKING THE DEFENSE OF ADVERSARIAL TRAINING VIA NON-GRADIENT ADVERSARIAL ATTACK
Fan, Mingyuan
Liu, Yang
Chen, Cen
Yu, Shengxing
Guo, Wenzhong
Liu, Ximeng
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022 : 3293 - 3297
[36] Application of an improved LightGBM hybrid integration model combining gradient harmonization and Jacobian regularization for breast cancer diagnosis
Sun, Xiaoyan
SCIENTIFIC REPORTS, 2025, 15 (01)
[37] Application of L1-norm regularization to epicardial potential reconstruction based on gradient projection
Wang, Liansheng
Qin, Jing
Wong, Tien Tsin
Heng, Pheng Ann
PHYSICS IN MEDICINE AND BIOLOGY, 2011, 56 (19) : 6291 - 6310
[38] Seismic impedance inversion using l1-norm regularization and gradient descent methods
Wang, Yanfei
JOURNAL OF INVERSE AND ILL-POSED PROBLEMS, 2010, 18 (07) : 823 - 838
[39] DefenseFea: An Input Transformation Feature Searching Algorithm Based Latent Space for Adversarial Defense
Pan, Zhang
Cao, Yangjie
Zhu, Chenxi
Yan, Zhuang
Wang, Haobo
Jie, Li
FOUNDATIONS OF COMPUTING AND DECISION SCIENCES, 2024, 49 (01) : 21 - 36
[40] A robust defense for spiking neural networks against adversarial examples via input filtering
Guo, Shasha
Wang, Lei
Yang, Zhijie
Lu, Yuliang
JOURNAL OF SYSTEMS ARCHITECTURE, 2024, 153
