Few-shot image recognition based on multi-scale features prototypical network

Cited by: 0
Authors
Liu, Jiatong [1]
Duan, Yong [1]
Affiliations
[1] School of Information Science Engineering, Shenyang University of Technology, Shenyang 110870, P. R. China; Shenyang Key Laboratory of Advanced Computing and Application Innovation
Keywords
channel attention; few-shot learning; label-smoothing; multi-scale feature; prototypical network
DOI
10.3772/j.issn.1006-6748.2024.03.007
Abstract
To improve the model's ability to express features in few-shot learning, a multi-scale features prototypical network (MS-PN) algorithm is proposed. A metric learning algorithm extracts image features and projects them into a feature space, where the similarity between samples is evaluated by their relative distances. To extract sufficient feature information from limited sample data and mitigate the impact of the constrained data volume, a multi-scale feature extraction network captures data features at various scales during image feature extraction. Additionally, the position of each prototype is fine-tuned by assigning weights to data points, which reduces the influence of outliers. The loss function integrates contrastive loss and label smoothing to bring similar data points closer together and separate dissimilar data points within the metric space. Experimental evaluations are conducted on the small-sample datasets mini-ImageNet and CUB200-2011. The proposed method achieves higher classification accuracy. Specifically, in the 5-way 1-shot experiment, classification accuracy reaches 50.13% and 66.79% on the two datasets, respectively. In the 5-way 5-shot experiment, accuracies of 66.79% and 85.91% are observed, respectively. © 2024 Inst. of Scientific and Technical Information of China. All rights reserved.
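The prototype weighting and label-smoothed classification described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weights are computed, so the scheme below (a softmax over negative squared distances to the plain class mean) is an assumption for illustration only, as are all function names, the `temperature` and `epsilon` parameters, and the toy 2-D embeddings. The multi-scale feature extractor, channel attention, and contrastive loss term are omitted.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_prototype(embeddings, temperature=1.0):
    """Weighted class prototype.

    ASSUMPTION: weights are a softmax over negative squared distances to the
    plain mean, so points far from the class centre contribute less. The
    paper's actual weighting rule is not given in the abstract.
    """
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]
    weights = softmax([-squared_distance(e, mean) / temperature for e in embeddings])
    return [sum(w * e[d] for w, e in zip(weights, embeddings)) for d in range(dim)]

def classify(query, prototypes):
    """Class probabilities: softmax over negative squared distances to prototypes."""
    return softmax([-squared_distance(query, p) for p in prototypes])

def label_smoothing_loss(probs, target, epsilon=0.1):
    """Cross-entropy against a smoothed target distribution (hypothetical epsilon)."""
    k = len(probs)
    smoothed = [epsilon / k + (1.0 - epsilon) * (1.0 if i == target else 0.0)
                for i in range(k)]
    # Clamp probabilities to avoid log(0) for very distant classes.
    return -sum(t * math.log(max(p, 1e-12)) for t, p in zip(smoothed, probs))

# Toy 2-way 3-shot episode with hand-made 2-D "embeddings" (illustrative only).
support = {
    0: [[0.0, 0.0], [0.2, 0.0], [3.0, 3.0]],  # the last point acts as an outlier
    1: [[5.0, 5.0], [5.2, 5.0], [5.0, 5.2]],
}
prototypes = [weighted_prototype(support[c]) for c in sorted(support)]
probs = classify([0.1, 0.1], prototypes)
loss = label_smoothing_loss(probs, target=0)
```

In this toy episode, the weighted prototype of class 0 stays close to the two clustered points instead of being pulled toward the outlier, which is the effect the abstract attributes to the weighting step.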
Pages: 280-289
Page count: 9