Multi-scale Kronecker-product relation networks for few-shot learning

Cited by: 16
Authors
Abdelaziz, Mounir [1]
Zhang, Zuping [1]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, 932 South Lushan Rd, Changsha 410083, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot learning; Multi-scale feature; Position-aware feature; Kronecker-product; Relation networks; Object recognition;
DOI
10.1007/s11042-021-11735-w
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Few-shot learning aims to train classifiers that learn new visual object categories from only a few training examples. Recently, metric-learning-based methods have made promising progress. Relation Network is a metric-based method that uses simple convolutional neural networks to learn deep relationships between image features in order to recognize new objects. However, during the feature comparison phase, Relation Network is considered sensitive to the spatial positions of the compared objects. Moreover, it learns from single-scale features only, which can lead to poor generalization when the compared objects vary in scale. To solve these problems, we extend Relation Network to be position-aware and to integrate multi-scale features, for more robust metric learning and better generalization. In this paper, we propose a novel few-shot learning method called Multi-scale Kronecker-Product Relation Networks For Few-Shot Learning (MsKPRN). Our method combines feature maps with spatial correlation maps generated by a Kronecker-product module, which capture position-wise correlations between the compared features, and then feeds them to a relation network module that captures similarities between the combined features in a multi-scale manner. Extensive experiments demonstrate that the proposed method outperforms related state-of-the-art methods on popular few-shot learning datasets. In particular, MsKPRN improves the accuracy of Relation Network from 50.44% to 57.02% in the 5-way 1-shot scenario and from 65.63% to 72.06% in the 5-way 5-shot scenario. Our code will be available at: https://github.com/mouniraziz/MsKPRN.
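The position-wise correlation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `kronecker_correlation` and `combine`, the tensor shapes, and the choice to concatenate both feature maps with the correlation maps along the channel axis are assumptions made for illustration only. The sketch relies on the fact that summing per-channel Kronecker products of two feature maps yields the dot product between every pair of spatial positions.

```python
import numpy as np


def kronecker_correlation(x, y):
    """Position-wise correlation maps between two feature maps.

    x, y: arrays of shape (C, H, W) (hypothetical layout).
    Returns an array of shape (H*W, H, W): channel p holds the dot
    products between position p of x and every position of y.
    """
    c, h, w = x.shape
    # corr[i, j, k, l] = sum_c x[c, i, j] * y[c, k, l]
    corr = np.einsum("cij,ckl->ijkl", x, y)
    return corr.reshape(h * w, h, w)


def combine(x, y):
    """Concatenate both feature maps with their correlation maps (assumed design)."""
    corr = kronecker_correlation(x, y)
    return np.concatenate([x, y, corr], axis=0)


rng = np.random.default_rng(0)
support = rng.standard_normal((8, 5, 5))  # toy support-image features
query = rng.standard_normal((8, 5, 5))    # toy query-image features
combined = combine(support, query)
print(combined.shape)
```

In a full model, a tensor like `combined` would be passed to a learned relation module, and the abstract indicates this comparison is done at several feature scales; neither step is shown here.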
Pages: 6703-6722
Number of pages: 20
Related papers
60 in total
  • [1] Few-shot learning with saliency maps as additional visual information
    Abdelaziz, Mounir
    Zhang, Zuping
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (07) : 10491 - 10508
  • [2] [Anonymous], 2010, Caltech-UCSD Birds 200
  • [3] Learning to Forget for Meta-Learning
    Baik, Sungyong
    Hong, Seokil
    Lee, Kyoung Mu
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2376 - 2384
  • [4] RECOGNITION-BY-COMPONENTS - A THEORY OF HUMAN IMAGE UNDERSTANDING
    BIEDERMAN, I
    [J]. PSYCHOLOGICAL REVIEW, 1987, 94 (02) : 115 - 147
  • [5] Memory Matching Networks for One-Shot Image Recognition
    Cai, Qi
    Pan, Yingwei
    Yao, Ting
    Yan, Chenggang
    Mei, Tao
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4080 - 4088
  • [6] Chen H, 2020, arXiv:2011.14479
  • [7] Image Deformation Meta-Networks for One-Shot Learning
    Chen, Zitian
    Fu, Yanwei
    Wang, Yu-Xiong
    Ma, Lin
    Liu, Wei
    Hebert, Martial
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 8672 - 8681
  • [8] Multi-Level Semantic Feature Augmentation for One-Shot Learning
    Chen, Zitian
    Fu, Yanwei
    Zhang, Yinda
    Jiang, Yu-Gang
    Xue, Xiangyang
    Sigal, Leonid
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (09) : 4594 - 4605
  • [9] Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification
    Chu, Wen-Hsuan
    Li, Yu-Jhe
    Chang, Jing-Cheng
    Wang, Yu-Chiang Frank
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 6244 - 6253
  • [10] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171