A metric-based meta-learning approach combined attention mechanism and ensemble learning for few-shot learning

Cited by: 21

Authors:

Guo, Nan ^{[1,2,3,4,5]}

Di, Kexin ^{[1]}

Liu, Hongyan ^{[1,2,3,4,5]}

Wang, Yifei ^{[1]}

Qiao, Junfei ^{[1,2,3,4,5]}

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China

[2] Minist Educ, Engn Res Ctr Intelligence Percept & Autonomous Co, Beijing, Peoples R China

[3] Beijing Lab Smart Environm Protect, Beijing, Peoples R China

[4] Beijing Key Lab Computat Intelligence & Intellige, Beijing, Peoples R China

[5] Beijing Artificial Intelligence Inst, Beijing, Peoples R China

Source:

DISPLAYS | 2021 / Volume 70

Funding:

National Natural Science Foundation of China;

Keywords:

Meta-learning; Ensemble learning; Metric-learning; Attention module; Few-shot learning; NETWORK;

DOI:

10.1016/j.displa.2021.102065

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Meta-learning is one of the latest research directions in machine learning and is considered one of the most probable ways to realize strong artificial intelligence. Meta-learning seeks to enable machines to learn as human beings do: to recognize things from only a few samples and to adapt quickly to new tasks. The challenge lies in training an efficient model with limited labeled data, since such a model is easily over-fitted. In this paper, we address this obvious but important problem and propose a metric-based meta-learning model that combines attention mechanisms with an ensemble learning method. In our model, we first design a dual-path attention module that considers both channel attention and spatial attention, and we stack these attention modules to construct a meta-learner for few-shot meta-learning. Then, we apply an ensemble method called snapshot ensemble to the attention-based meta-learner in order to generate more models in a single episode. Features extracted by the models are fed into the metric-based architecture to compute a prototype for each class. Our proposed method strengthens the feature-extraction ability of the backbone network in the meta-learner and reduces over-fitting through ensemble learning and metric learning. Experimental results on several meta-learning datasets show that our approach is effective.
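
The prototype step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which distance is used or how the snapshot models are combined, so Euclidean distance and majority voting are assumptions here, and the function names (`class_prototypes`, `nearest_prototype`, `ensemble_predict`) and the toy feature vectors are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the support feature vectors of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(features, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def nearest_prototype(query, prototypes):
    """Return the label whose prototype is closest (Euclidean, an assumption)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


def ensemble_predict(query_views, support_views, labels):
    """Majority vote over per-snapshot nearest-prototype predictions.

    query_views[i] and support_views[i] hold the features produced by the
    i-th snapshot model. Voting is an assumed combination rule.
    """
    votes = defaultdict(int)
    for query, support in zip(query_views, support_views):
        votes[nearest_prototype(query, class_prototypes(support, labels))] += 1
    return max(votes, key=votes.get)


# Toy 2-way 2-shot episode seen through two hypothetical snapshot models.
labels = ["cat", "cat", "dog", "dog"]
support_views = [
    [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]],
    [[0.1, 0.1], [0.1, 0.3], [0.9, 1.1], [1.1, 0.9]],
]
query_views = [[0.9, 0.8], [1.0, 0.9]]
prediction = ensemble_predict(query_views, support_views, labels)
```

In this toy episode both snapshot views place the query nearest to the "dog" prototype, so the vote is unanimous. In a real pipeline the feature vectors would come from the attention-based backbone rather than being written by hand.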


Pages: 8
