Adversarially attack feature similarity for fine-grained visual classification

Cited by: 4
Authors
Wang, Yupeng [1 ]
Xu, Can [1 ]
Wang, Yongli [1 ]
Wang, Xiaoli [1 ]
Ding, Weiping [2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Coll Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; Adversarial attack; Fine-grained visual classification; Attention mechanism; Causal inference;
DOI
10.1016/j.asoc.2024.111945
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Fine-grained visual classification (FGVC) strives to distinguish images from distinct sub-classes within the same overarching meta-class, which is significant in various practical applications. Existing works mainly employ attention mechanisms to learn discriminative feature representations of objects under weakly supervised learning. In this paper, we argue that this likelihood-based attention learning manner often yields an inadequate feature representation, since the available image-level labels fail to provide an explicit supervisory signal for attention learning, especially when the fine-grained images exhibit small and inconsistent inter-class variance. To alleviate this issue, we approach this challenging task from the perspective of attacking the feature representation between similar sub-classes to maximize feature discriminativeness via learning adversarial examples, and propose an Adversarial-Aware Fine-Grained Visual Classification Network (A²Net). Specifically, we first propose an adversarial attack module based on projected gradient descent, which adds multi-scale adversarial perturbations to simulate sub-class examples with different similarities. Then, we introduce an adversarial attention generation module that estimates, through causal inference, the effect of the attention learned on adversarial and legitimate examples on the final class prediction. The adversarial attention generation module is encouraged to maximize this effect, which provides powerful supervision to capture more attention indicating the discriminative parts. We further propose an adversarial-aware module to learn the feature-level differences between legitimate and adversarial examples, which helps enhance the semantic boundaries of class-specific features for accurate FGVC. Extensive experiments demonstrate the efficacy of the proposed A²Net, which outperforms state-of-the-art FGVC methods on the CUB-200-2011, FGVC-Aircraft, Stanford Cars, Stanford Dogs, and NABirds benchmarks.
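The adversarial attack module above is built on projected gradient descent (PGD). The following is a minimal sketch of how multi-scale PGD perturbations of this kind could be generated, assuming a PyTorch image classifier `model` with inputs in [0, 1] and cross-entropy as the attack objective; the function names, step sizes, and epsilon values are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Start from a random point inside the L-infinity ball of radius eps around x.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the classification loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

def multi_scale_adversaries(model, x, y, eps_list=(2/255, 4/255, 8/255)):
    # Larger eps yields a stronger perturbation, here standing in for a less
    # similar simulated sub-class example (scales chosen for illustration only).
    return [pgd_perturb(model, x, y, eps=eps) for eps in eps_list]

The 8/255 bound and sign-of-gradient step follow the common PGD convention for images scaled to [0, 1]; the paper's actual perturbation scales and attack objective may differ.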
Pages: 13