Dual stage black-box adversarial attack against vision transformer

Authors
Wang, Fan [1]
Shao, Mingwen [1]
Meng, Lingzhuang [1]
Liu, Fukang [1]
Affiliations
[1] China Univ Petr East China, Coll Comp Sci & Technol, Changjiang Rd, Qingdao 266000, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adversarial attack; Vision transformer; Black-box attack; Transferability
DOI
10.1007/s13042-024-02097-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Relying on wide receptive fields, Vision Transformers (ViTs) are more robust than Convolutional Neural Networks (CNNs). Consequently, some transfer-based attack methods that perform well on CNNs perform poorly when attacking ViTs. To address this issue, we propose a dual-stage attack framework named DSA. More specifically, we introduce a dual spatial optimization strategy, involving both decision-space and feature-space optimization, to improve the transferability of adversarial examples across different ViTs. Adversarial perturbations are generated by our proposed semi self-integrated module in the first stage and optimized by the feature extractor in the second stage. During this process, our proposed integrated model makes full use of the discriminative information in the deep transformer blocks and achieves significant improvements in transferability. To further enhance transferability, we design a random perturbation masking module that alleviates the over-fitting of adversarial examples to the surrogate model. We evaluate the transferability of attacks on state-of-the-art ViTs, CNNs, and robustly trained CNNs. Extensive experiments demonstrate that the proposed dual-stage attack can greatly boost transferability between ViTs and from ViTs to CNNs.
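The abstract names a random perturbation masking module but does not say how the mask is drawn. The sketch below is one plausible reading, not the authors' implementation: it assumes the mask is sampled independently per square patch (matching the patch grid a ViT operates on) and that masked patches simply have their perturbation zeroed. The function names, the `patch` size, and `keep_prob` are all hypothetical.

```python
import random


def random_patch_mask(height, width, patch, keep_prob, rng):
    """Return a height x width 0/1 mask that is constant within each patch.

    Assumption: one independent Bernoulli(keep_prob) draw per patch.
    """
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    rows, cols = height // patch, width // patch
    # 1 keeps the perturbation in that patch, 0 zeroes it out.
    keep = [[1 if rng.random() < keep_prob else 0 for _ in range(cols)]
            for _ in range(rows)]
    # Expand the per-patch decisions to per-pixel resolution.
    return [[keep[y // patch][x // patch] for x in range(width)]
            for y in range(height)]


def apply_mask(perturbation, mask):
    """Element-wise product of a 2-D perturbation and a 0/1 mask."""
    return [[p * m for p, m in zip(p_row, m_row)]
            for p_row, m_row in zip(perturbation, mask)]


# Toy usage: an 8x8 single-channel perturbation split into 4x4 patches.
rng = random.Random(0)
delta = [[0.5] * 8 for _ in range(8)]
mask = random_patch_mask(8, 8, patch=4, keep_prob=0.5, rng=rng)
masked = apply_mask(delta, mask)
```

In an iterative attack, a fresh mask would typically be drawn at every step so that no single update relies on the full perturbation, which is the intuition behind reducing over-fitting to the surrogate model. Whether DSA masks per patch, per pixel, or in some other way is not stated in this record.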
Pages: 3367-3378
Number of pages: 12