Granularity-aware distillation and structure modeling region proposal network for fine-grained image classification

Cited by: 33
Authors
Ke, Xiao [1, 2]
Cai, Yuhang [1, 2]
Chen, Baitao [1, 2]
Liu, Hao [1, 2]
Guo, Wenzhong [1, 2]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
[2] Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fine-grained visual classification; Multi-granularity feature learning; Knowledge distillation; Structure modeling;
DOI
10.1016/j.patcog.2023.109305
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fine-grained visual classification (FGVC) aims to identify objects belonging to multiple sub-categories of the same super-category. The key to solving fine-grained classification problems is to learn discriminative visual feature representations from only subtle differences. Although previous work based on refined feature learning has made great progress, high-level semantic features often lack key information about the nuances of fine-grained visual objects. How to efficiently integrate semantic information of different granularities from classification networks is a critical problem. In this paper, we propose the Granularity-aware Distillation and Structure Modeling region Proposal Network (GDSMP-Net). Our solution integrates multi-granularity hierarchical information through a multi-granularity fusion learning strategy to enhance feature representation. In view of the inherent challenge of large intra-class differences in FGVC, a cross-layer self-distillation regularization is proposed to strengthen the connection between high-level semantics and low-level semantics for robust multi-granularity feature learning. On this basis, we use a weakly supervised method to generate local branches and perform collaborative learning of discriminative semantics and structural semantics based on local regions, helping the model perceive contextual information and capture structural interactions between local semantics. Comprehensive experiments show that our method achieves state-of-the-art performance on four widely used, challenging datasets (CUB-200-2011, Stanford Cars, FGVC-Aircraft and NA-birds). (c) 2023 Elsevier Ltd. All rights reserved.
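The abstract names a cross-layer self-distillation regularization but does not state its loss. As a rough illustration only, the sketch below assumes a common generic formulation: the temperature-softened prediction from a deep layer acts as the teacher, and a shallower layer's prediction is pulled toward it with a KL-divergence term. The function names, the temperature value, and the T^2 scaling are assumptions for illustration, not the authors' GDSMP-Net implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(shallow_logits, deep_logits, temperature=4.0):
    """Hypothetical cross-layer term: the deep layer teaches the shallow layer.

    Assumption: the usual knowledge-distillation convention of multiplying
    by temperature**2 so the term's scale stays comparable across temperatures.
    """
    teacher = softmax(deep_logits, temperature)
    student = softmax(shallow_logits, temperature)
    return temperature ** 2 * kl_divergence(teacher, student)


if __name__ == "__main__":
    # Toy three-class logits from a shallow and a deep classifier head.
    print(self_distillation_loss([0.5, 1.0, 0.2], [0.1, 2.5, 0.3]))
```

In a training loop, a term like this would typically be added to the ordinary classification loss with a weighting coefficient; the abstract does not specify that weighting.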
Pages: 14
Related papers
46 in total
  • [1] Bargal SA, 2021, IEEE T PATTERN ANAL, V43, P4196, DOI [10.1109/TPAMI.2021.3054303, 10.1109/TPAMI.2020.3054303]
  • [2] SR-GNN: Spatial Relation-Aware Graph Neural Network for Fine-Grained Image Categorization
    Bera, Asish
    Wharton, Zachary
    Liu, Yonghuai
    Bessis, Nik
    Behera, Ardhendu
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6017 - 6031
  • [3] The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification
    Chang, Dongliang
    Ding, Yifeng
    Xie, Jiyang
    Bhunia, Ayan Kumar
    Li, Xiaoxu
    Ma, Zhanyu
    Wu, Ming
    Guo, Jun
    Song, Yi-Zhe
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 4683 - 4695
  • [4] Destruction and Construction Learning for Fine-grained Image Recognition
    Chen, Yue
    Bai, Yalong
    Zhang, Wei
    Mei, Tao
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 5152 - 5161
  • [5] AP-CNN: Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification
    Ding, Yifeng
    Ma, Zhanyu
    Wen, Shaoguo
    Xie, Jiyang
    Chang, Dongliang
    Si, Zhongwei
    Wu, Ming
    Ling, Haibin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2826 - 2836
  • [6] Dubey A., 2018, ADV NEURAL INFORM PR, V31
  • [7] Gao Y, 2020, AAAI CONF ARTIF INTE, V34, P10818
  • [8] Hanselmann H, 2020, IEEE WINT CONF APPL, P1236, DOI 10.1109/WACV45572.2020.9093601
  • [9] A hierarchical sampling based triplet network for fine-grained image classification
    He, Guiqing
    Li, Feng
    Wang, Qiyao
    Bai, Zongwen
    Xu, Yuelei
    [J]. PATTERN RECOGNITION, 2021, 115
  • [10] Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition
    Huang, Shaoli
    Wang, Xinchao
    Tao, Dacheng
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 600 - 609