Dual stage black-box adversarial attack against vision transformer

Cited by: 0

Authors:

Wang, Fan ^{[1]}

Shao, Mingwen ^{[1]}

Meng, Lingzhuang ^{[1]}

Liu, Fukang ^{[1]}

Affiliations:

[1] China Univ Petr East China, Coll Comp Sci & Technol, Changjiang Rd, Qingdao 266000, Shandong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2024 / Vol. 15 / No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

Adversarial attack; Vision transformer; Black-box attack; Transferability;

DOI:

10.1007/s13042-024-02097-4

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Relying on wide receptive fields, Vision Transformers (ViTs) are more robust than Convolutional Neural Networks (CNNs). Consequently, some transfer-based attack methods that perform well on CNNs perform poorly when attacking ViTs. To address this issue, we propose a dual-stage attack framework named DSA. More specifically, we introduce a dual spatial optimization strategy involving both decision space and feature space optimization to improve the transferability of adversarial examples across different ViTs. Adversarial perturbations are generated by our proposed semi self-integrated module in the first stage and optimized by the feature extractor in the second stage. During this process, our proposed integrated model makes full use of the discriminative information in the deep transformer blocks and achieves significant improvements in transferability. To further enhance transferability, we design a random perturbation masking module to alleviate the over-fitting of adversarial examples to the surrogate model. We evaluate the transferability of attacks on state-of-the-art ViTs, CNNs, and robustly trained CNNs. Extensive experiments demonstrate that the proposed dual-stage attack can greatly boost transferability between ViTs and from ViTs to CNNs.
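
The abstract does not specify how the random perturbation masking module works. As a rough illustration of the general idea only, the sketch below zeroes out randomly chosen square patches of a perturbation before it is applied. The function name `mask_perturbation`, the patch-level granularity, and the `keep_prob` and `patch` parameters are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def mask_perturbation(delta, keep_prob=0.7, patch=16, rng=None):
    """Zero out random patches of a perturbation (illustrative sketch only).

    delta: array of shape (C, H, W); H and W are assumed divisible by `patch`.
    keep_prob, patch: hypothetical parameters, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = delta.shape[-2:]
    # One Bernoulli draw per patch: True keeps the patch, False drops it.
    grid = rng.random((h // patch, w // patch)) < keep_prob
    # Expand the patch-level grid to pixel resolution.
    mask = np.kron(grid.astype(delta.dtype), np.ones((patch, patch), dtype=delta.dtype))
    # The (H, W) mask broadcasts over the channel axis.
    return delta * mask
```

In an iterative attack, a fresh mask would typically be drawn at each step, so that no single region of the perturbation is tuned too closely to the surrogate model.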

Pages: 3367-3378

Page count: 12

48 references in total

[1] Understanding Robustness of Transformers for Image Classification [J].