Learning attention-guided pyramidal features for few-shot fine-grained recognition

Citations: 177
Authors
Tang, Hao [1]
Yuan, Chengcheng [1]
Li, Zechao [1]
Tang, Jinhui [1]
Affiliations
[1] Nanjing University of Science and Technology, School of Computer Science and Engineering, Nanjing 210094, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot learning; Fine-grained recognition; Weakly-supervised learning; NETWORK;
DOI
10.1016/j.patcog.2022.108792
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Few-shot fine-grained recognition (FS-FGR) aims to distinguish several highly similar objects from different sub-categories with limited supervision. However, traditional few-shot learning solutions typically exploit image-level features and focus on capturing global silhouettes while neglecting local details, which inevitably loses inconspicuous but discriminative information. How to effectively address fine-grained recognition given limited samples therefore remains a major challenge. In this article, we propose an effective bidirectional pyramid architecture that enhances the internal representations of features for the fine-grained image recognition task in the few-shot learning scenario. Specifically, we deploy a multi-scale feature pyramid and a multi-level attention pyramid on the backbone network, and progressively aggregate features from different granular spaces via both of them. We further present an attention-guided refinement strategy that works with the multi-level attention pyramid to reduce the uncertainty introduced by backgrounds when samples are limited. In addition, the proposed method is trained with the meta-learning framework in an end-to-end fashion without any extra supervision. Extensive experimental results on four challenging and widely used fine-grained benchmarks show that the proposed method performs favorably against state-of-the-art methods, especially in the one-shot scenarios. © 2022 Elsevier Ltd. All rights reserved.
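The pyramid idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes single-channel feature maps stored as nested lists, a plain top-down pathway (upsample the coarser map and add it to the finer one), and a sigmoid of the fused activation as a stand-in for spatial attention. All function names are hypothetical and chosen only for illustration.

```python
import math

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an H x W map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add_maps(a, b):
    """Element-wise sum of two maps with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def spatial_attention(fmap):
    """Assumed attention: sigmoid of each activation, a weight in (0, 1)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in fmap]

def refine(fmap, attn):
    """Attention-guided refinement: re-weight each location by its attention."""
    return [[v * a for v, a in zip(rf, ra)] for rf, ra in zip(fmap, attn)]

def build_pyramids(levels):
    """levels: maps ordered fine -> coarse, each half the size of the previous.

    Returns (refined_features, attentions), both ordered fine -> coarse.
    """
    # Top-down pathway: start from the coarsest map and fuse into finer ones.
    fused = [levels[-1]]
    for fmap in reversed(levels[:-1]):
        fused.insert(0, add_maps(fmap, upsample2x(fused[0])))
    attns = [spatial_attention(f) for f in fused]
    refined = [refine(f, a) for f, a in zip(fused, attns)]
    return refined, attns

# Hypothetical two-level input: a 4x4 fine map and a 2x2 coarse map.
fine = [[0.0] * 4 for _ in range(4)]
coarse = [[0.0, 2.0], [0.0, 0.0]]
refined, attns = build_pyramids([fine, coarse])
```

In this toy input the strong coarse activation is propagated to the matching 2x2 region of the fine map, where it receives a higher attention weight than the background locations.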
Pages: 10