Learning more discriminative clues with gradual attention for fine-grained visual categorization

Cited by: 1
Authors
Xu, Qin [1, 2]
Zhang, Mengquan [1, 2]
Li, Yun [1, 2]
Tao, Zhifu [3]
Affiliations
[1] Anhui Univ, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[3] Anhui Univ, Sch Big Data & Stat, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Fine-grained visual categorization; Convolutional neural network; Visual attention; Self-calibrated convolution; IMAGE CLASSIFICATION; NETWORK; MODEL; CNN
DOI
10.1016/j.imavis.2023.104753
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained visual categorization, which aims to distinguish subcategories of images within the same category, is a very challenging task because of large intra-class differences and subtle inter-class variances. Existing methods mostly focus on salient local regions and ignore other features that could help recognize images more precisely. To address this issue, we propose a novel end-to-end network for fine-grained visual categorization composed of self-calibrated convolution, a gradual attention module and a feature inverse module. To extract salient features, self-calibrated convolution is exploited, which can avoid the influence of irrelevant information and locate salient regions more accurately. To extract discriminative features, we propose the gradual attention module, which consists of alternate channel-spatial attention and hierarchical feature grouping. The gradual attention module can extract subtle discriminative features gradually, even when the semantic information of shallow stages is not rich. Moreover, we design the feature inverse module, which forces the next stage of the network to search for other useful features by feature inverse. Combined with the feature inverse module, the gradual attention module is capable of finding more detailed regions, which helps improve classification performance. Finally, the stage features and fused features are jointly used for classification. The proposed method is evaluated on three classical fine-grained image datasets and compared with a number of state-of-the-art methods. It achieves accuracies of 89.5%, 95.2% and 93.9% on the CUB-200-2011, Stanford Cars and FGVC-Aircraft datasets, respectively. The experimental results demonstrate the effectiveness and superiority of the proposed method.
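The abstract names channel-spatial attention and a feature inverse step but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that channel weights come from a sigmoid over each channel's global average, that spatial weights come from a sigmoid over the cross-channel mean, and that "feature inverse" means weighting features by one minus the spatial attention so that a later stage sees less of the already-attended region. All function names are hypothetical, and the hierarchical feature grouping and self-calibrated convolution are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """One weight per channel: sigmoid of the channel's global average (assumed form)."""
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(feat):
    """One weight per location: sigmoid of the mean over channels (assumed form)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]


def attend(feat):
    """Apply channel attention, then spatial attention; return features and the spatial map."""
    cw = channel_attention(feat)
    scaled = [[[v * cw[k] for v in row] for row in channel]
              for k, channel in enumerate(feat)]
    sw = spatial_attention(scaled)
    attended = [[[scaled[k][i][j] * sw[i][j] for j in range(len(sw[0]))]
                 for i in range(len(sw))] for k in range(len(scaled))]
    return attended, sw


def feature_inverse(feat, spatial_weights):
    """Down-weight attended locations with (1 - attention) (assumed form of feature inverse)."""
    return [[[feat[k][i][j] * (1.0 - spatial_weights[i][j])
              for j in range(len(spatial_weights[0]))]
             for i in range(len(spatial_weights))] for k in range(len(feat))]


# Toy feature map: 2 channels, 2x2 locations, strongest response at location (0, 0).
feat = [[[4.0, 1.0], [1.0, 1.0]],
        [[2.0, 1.0], [1.0, 1.0]]]
attended, sw = attend(feat)
inverted = feature_inverse(feat, sw)
```

In this toy example the spatial map is highest at the location with the strongest response, and the inverted features reduce that location's dominance relative to the others, which is the behaviour the abstract attributes to the feature inverse module.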
Pages: 14