Bidirectional Patch-Aware Attention Network for Few-Shot Learning

Times Cited: 0
Authors
Mao, Yu [1 ,2 ]
Lin, Shaojie [1 ,3 ]
Lin, Zilong [1 ,3 ]
Lin, Yaojin [1 ,3 ]
Affiliations
[1] Minnan Normal Univ, Sch Comp Sci, Zhangzhou 363000, Fujian, Peoples R China
[2] Minnan Normal Univ, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
[3] Minnan Normal Univ, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
Source
IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2025
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Few-shot learning; Image reconstruction; Attention mechanisms; Vectors; Semantics; Computational modeling; Training; Prototypes; Dogs; Attention mechanism; bidirectional patch-aware; feature aggregation; few-shot learning;
DOI
10.1109/TCSS.2025.3548057
CLC Classification Number
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Few-shot learning (FSL) aims to train a model with a minimal number of samples and then apply it to recognize unseen classes. Recently, metric-based methods have mainly focused on exploring the relationship between the support set and the query set through attention mechanisms. However, these methods typically employ unidirectional computation when calculating the attention relationship between support and query. This unidirectional approach not only limits the depth and breadth of knowledge acquisition but may also lead to mismatched patches between support and query, thereby degrading the overall performance of the model. In this article, we propose a bidirectional patch-aware attention network for few-shot learning (BPAN) to address this issue. First, we extract subimages via grid cropping and feed them into the learned feature extractor to obtain patch features. Self-attention is then used to assign different weights to the patch features and reconstruct them. Next, PFCAM is proposed to mutually explore the patch feature relationship between the support set and the query set, further reconstruct the patch features, and aggregate the multiple patch features of each image into a single feature through a learnable parameter matrix for prediction. Finally, a template is constructed for each class to extend the results of PFCAM to the few-shot classification scenario. Experiments on three benchmark datasets show that BPAN achieves superior performance.
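The pipeline described above (grid-cropped patch features, self-attention reweighting, bidirectional support-query cross-attention, and learnable aggregation into a single matching score) can be illustrated with a minimal sketch. The module names, head counts, feature dimensions, cosine-similarity scoring, and the pairwise expansion below are assumptions for illustration only, not the authors' implementation or the exact PFCAM design.

# Minimal sketch of the bidirectional patch-aware attention flow (assumed design,
# not the paper's code): self-attention over each image's patch features,
# cross-attention in both directions between support and query patches, and a
# learnable weight vector that aggregates patches into one feature for matching.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BidirectionalPatchAttention(nn.Module):
    """Toy stand-in for the bidirectional patch-aware attention idea."""

    def __init__(self, dim: int, num_patches: int):
        super().__init__()
        # Self-attention reweights/reconstructs each image's own patch features.
        self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Cross-attention applied in both directions (support->query, query->support).
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Learnable aggregation weights that collapse P patch features into one vector.
        self.agg = nn.Parameter(torch.randn(num_patches) / num_patches ** 0.5)

    def forward(self, support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # support: (Ns, P, D) patch features of support images
        # query:   (Nq, P, D) patch features of query images
        support, _ = self.self_attn(support, support, support)
        query, _ = self.self_attn(query, query, query)

        # Pairwise expansion so every query image attends to every support image.
        Ns, Nq, P, D = support.size(0), query.size(0), support.size(1), support.size(2)
        s = support.unsqueeze(0).expand(Nq, -1, -1, -1).reshape(Nq * Ns, P, D)
        q = query.unsqueeze(1).expand(-1, Ns, -1, -1).reshape(Nq * Ns, P, D)
        q2s, _ = self.cross_attn(q, s, s)  # query patches reconstructed from support
        s2q, _ = self.cross_attn(s, q, q)  # support patches reconstructed from query

        # Aggregate patch features into one vector per (query, support) pair.
        w = F.softmax(self.agg, dim=0)                 # (P,)
        q_vec = torch.einsum("bpd,p->bd", q2s, w)
        s_vec = torch.einsum("bpd,p->bd", s2q, w)

        # Cosine similarity as an illustrative matching score.
        return F.cosine_similarity(q_vec, s_vec, dim=-1).view(Nq, Ns)


if __name__ == "__main__":
    # 5-way 1-shot toy episode: 5 support images, 10 queries, 9 patches, 64-dim features.
    net = BidirectionalPatchAttention(dim=64, num_patches=9)
    scores = net(torch.randn(5, 9, 64), torch.randn(10, 9, 64))
    print(scores.shape)  # torch.Size([10, 5])

In a few-shot setting, the per-support scores would then be grouped by class (the class "templates" mentioned in the abstract) and the query assigned to the best-matching class; that step is omitted here for brevity.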
Pages: 11