Multi-scale Kronecker-product relation networks for few-shot learning

Cited by: 21

Authors:

Abdelaziz, Mounir [1]

Zhang, Zuping [1]

Affiliations:

[1] Cent South Univ, Sch Comp Sci & Engn, 932 South Lushan Rd, Changsha 410083, Hunan, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2022 / Vol. 81 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

Few-shot learning; Multi-scale feature; Position-aware feature; Kronecker-product; Relation networks; Object recognition;

DOI:

10.1007/s11042-021-11735-w

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Few-shot learning aims to train classifiers that learn new visual object categories from only a few training examples. Recently, metric-learning-based methods have made promising progress. Relation Network is a metric-based method that uses simple convolutional neural networks to learn deep relationships between image features in order to recognize new objects. However, during the feature comparison phase, Relation Network is considered sensitive to the spatial positions of the compared objects. Moreover, it learns from single-scale features only, which can lead to poor generalization when the compared objects vary in scale. To solve these problems, we extend Relation Network to be position-aware and to integrate multi-scale features for more robust metric learning and better generalization. In this paper, we propose a novel few-shot learning method called Multi-scale Kronecker-Product Relation Networks for Few-Shot Learning (MsKPRN). Our method combines feature maps with spatial correlation maps generated by a Kronecker-product module to capture position-wise correlations between the compared features, and then feeds them to a relation network module, which captures similarities between the combined features in a multi-scale manner. Extensive experiments demonstrate that the proposed method outperforms related state-of-the-art methods on popular few-shot learning datasets. In particular, MsKPRN improves the accuracy of Relation Network from 50.44 to 57.02 and from 65.63 to 72.06 in the 5-way 1-shot and 5-shot scenarios, respectively. Our code will be available at: https://github.com/mouniraziz/MsKPRN.
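As a rough illustration of the position-wise correlation idea mentioned in the abstract, the sketch below sums per-channel Kronecker products of two feature maps with NumPy. It is not the authors' implementation: the function name `kronecker_correlation`, the tensor shapes, and the random inputs are assumptions made only for this example. It shows that the summed Kronecker product holds the channel-wise dot product between every support position and every query position.

```python
import numpy as np

def kronecker_correlation(support, query):
    """Sum of per-channel Kronecker products of two (C, H, W) feature maps.

    Entry [i*H + k, j*W + l] of the result is the dot product over channels
    between support position (i, j) and query position (k, l).
    Hypothetical helper, not taken from the paper's code.
    """
    return sum(np.kron(s, q) for s, q in zip(support, query))

# Assumed toy shapes: C=8 channels, H=W=5 spatial positions per side.
rng = np.random.default_rng(0)
support = rng.standard_normal((8, 5, 5))
query = rng.standard_normal((8, 5, 5))

corr = kronecker_correlation(support, query)  # shape (H*H, W*W) = (25, 25)

# Cross-check against explicit pairwise dot products over the channel axis.
pairwise = np.einsum("cij,ckl->ikjl", support, query).reshape(25, 25)
print(corr.shape, np.allclose(corr, pairwise))
```

How such correlation maps are combined with the feature maps and compared at several scales is specific to the paper and is not reproduced here.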


Pages: 6703-6722

Page count: 20

