MAPE: defending against transferable adversarial attacks using multi-source adversarial perturbations elimination

Cited by: 0
Authors
Liu, Xinlei [1]
Xie, Jichao [1]
Hu, Tao [1]
Yi, Peng [1,2]
Hu, Yuxiang [1]
Huo, Shumin [1]
Zhang, Zhen [1]
Affiliations
[1] Informat Engn Univ, Zhengzhou 450002, Peoples R China
[2] Minist Educ, Key Lab Cyberspace Secur, Zhengzhou 450002, Peoples R China
Keywords
Deep learning security; Pattern recognition; Image classification; Adversarial example; Adversarial defense
DOI
10.1007/s40747-024-01770-z
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Neural networks are vulnerable to meticulously crafted adversarial examples, which cause high-confidence misclassifications in image classification tasks. Because transferable adversarial attacks conform to regular input patterns and require no access to the target model or its outputs, they are highly stealthy and difficult to detect, making them a primary focus of defense. In this work, we propose a deep learning defense, multi-source adversarial perturbations elimination (MAPE), to counter diverse transferable attacks. MAPE comprises the single-source adversarial perturbation elimination (SAPE) mechanism and the pre-trained models probabilistic scheduling algorithm (PPSA). SAPE uses a carefully designed channel-attention U-Net as the defense model and trains it on adversarial examples generated by a pre-trained model (e.g., ResNet), enabling it to eliminate known adversarial perturbations. PPSA introduces model-difference quantification and negative momentum to strategically schedule multiple pre-trained models, maximizing the differences among the adversarial examples used during the defense model's training and thereby strengthening its ability to eliminate adversarial perturbations. MAPE effectively removes the perturbations in a wide range of adversarial examples, providing a robust defense against attacks crafted on different substitute models. In a black-box attack scenario with ResNet-34 as the target model, our approach achieves average defense rates of over 95.1% on CIFAR-10 and over 71.5% on Mini-ImageNet, demonstrating state-of-the-art performance.
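To make the abstract's training procedure concrete, here is a minimal PyTorch sketch of the described loop: a substitute model is sampled per batch via a PPSA-style probability with a negative-momentum penalty, its adversarial examples are generated, and the defense U-Net is trained to reconstruct the clean images (the SAPE objective). Everything below is an assumption-laden illustration, not the authors' released code: the FGSM attack, the uniform difference scores, the softmax scheduling form, the momentum decay rate, and all function names (fgsm_attack, ppsa_probabilities, train_mape) are placeholders.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of MAPE training. The attack (FGSM), the difference
# scores, and the negative-momentum update below are illustrative stand-ins
# for the paper's components, not its actual implementation.

def fgsm_attack(model, x, y, eps=8 / 255):
    """Craft adversarial examples on a substitute model (FGSM for brevity;
    the paper may use stronger transferable attacks)."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def ppsa_probabilities(diff_scores, momentum, beta=0.9):
    """PPSA-style scheduling: favor substitute models whose adversarial
    examples differ most from those seen recently, while a negative
    momentum term (assumed form) suppresses recently chosen models."""
    return torch.softmax(diff_scores - beta * momentum, dim=0)

def train_mape(defense_unet, substitutes, loader, opt, epochs=10):
    """Train the defense model (a channel-attention U-Net in the paper)
    to map adversarial inputs back to their clean counterparts."""
    momentum = torch.zeros(len(substitutes))
    for _ in range(epochs):
        for x, y in loader:
            # Placeholder for the paper's model-difference quantification.
            diff_scores = torch.ones(len(substitutes))
            probs = ppsa_probabilities(diff_scores, momentum)
            i = int(torch.multinomial(probs, 1))
            momentum = 0.5 * momentum   # decay old penalties (assumed rate)
            momentum[i] += 1.0          # penalize the model just selected
            x_adv = fgsm_attack(substitutes[i], x, y)
            # SAPE objective: reconstruct the clean image from the
            # adversarial one (MSE used here as a plausible loss).
            loss = F.mse_loss(defense_unet(x_adv), x)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

The scheduling step is the key design choice the abstract highlights: by penalizing recently selected substitutes, the defense model sees a more diverse stream of adversarial perturbations than round-robin or uniform sampling would provide.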
Pages: 17