The Research about Recurrent Model-Agnostic Meta Learning

Cited by: 1
Authors
Chen, Shaodong [1 ]
Niu, Ziyu [2 ]
Affiliations
[1] Nanyang Inst Technol, Sch Math & Stat, Nanyang, Henan, Peoples R China
[2] Univ Edinburgh, Sch Informat, Artificial Intelligence, Edinburgh, Midlothian, Scotland
Keywords
Model-Agnostic Meta Learning; Omniglot dataset; Convolutional Neural Network; Recurrent Neural Network; Long Short-Term Memory; Gated Recurrent Unit; n-way n-shot model;
DOI
10.3103/S1060992X20010075
Chinese Library Classification (CLC) number
O43 [Optics];
Subject classification codes
070207; 0803;
Abstract
Although Deep Neural Networks (DNNs) have achieved great success in the machine-learning domain, they usually perform poorly on few-shot learning tasks, where a classifier has to generalize quickly after seeing very few samples from each class. A Model-Agnostic Meta-Learning (MAML) model is able to solve new learning tasks using only a small amount of training data. A MAML model with a Convolutional Neural Network (CNN) architecture (rather than a DNN), trained on the Omniglot dataset, is implemented as a baseline for image classification tasks. However, this baseline model suffers from a long training process and relatively low efficiency. To address these problems, we introduce Recurrent Neural Network (RNN) architectures and their advanced variants into our MAML model, including the Long Short-Term Memory (LSTM) architecture and its variants LSTM-b and the Gated Recurrent Unit (GRU). The experimental results, measured by accuracies, demonstrate a considerable improvement in image classification performance and training efficiency compared to the baseline models.
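The inner/outer-loop idea behind MAML that the abstract refers to can be sketched in a few lines. The following is a minimal, illustrative first-order MAML (FOMAML) loop on a hypothetical family of scalar regression tasks — not the paper's CNN/RNN implementation on Omniglot. All task definitions, function names, and hyperparameters here are assumptions chosen only to keep the example self-contained.

```python
import random

def grad(w, data):
    # Average gradient of squared error for a scalar linear model y_hat = w * x.
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def sample_task(rng):
    # Hypothetical task family: regression targets y = a * x with a random slope a.
    a = rng.uniform(1.0, 3.0)
    points = [(x, a * x) for x in (rng.uniform(-1, 1) for _ in range(10))]
    return points[:5], points[5:]  # support / query split

def fomaml(meta_steps=200, inner_lr=0.1, outer_lr=0.05, seed=0):
    rng = random.Random(seed)
    w = 0.0  # meta-parameter shared across tasks
    for _ in range(meta_steps):
        support, query = sample_task(rng)
        # Inner loop: one gradient step of task-specific adaptation.
        w_task = w - inner_lr * grad(w, support)
        # Outer loop: first-order meta-update using the query loss
        # evaluated at the adapted parameters.
        w = w - outer_lr * grad(w_task, query)
    return w

w = fomaml()
```

After meta-training, `w` settles near the mean task slope, i.e. an initialization from which a single inner gradient step adapts well to any task in the family. Full MAML additionally differentiates through the inner step (a second-order term), which the first-order variant drops for efficiency.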
Pages: 56-67
Number of pages: 12
References
16 items in total
[1]  
Andrychowicz M, 2016, ADV NEUR IN, V29
[2]  
[Anonymous], 2001, THESIS
[3]  
[Anonymous], 2016, OPTIMIZATION MODEL F
[4]  
[Anonymous], 2016, ICLR
[5]   LEARNING LONG-TERM DEPENDENCIES WITH GRADIENT DESCENT IS DIFFICULT [J].
BENGIO, Y ;
SIMARD, P ;
FRASCONI, P .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (02) :157-166
[6]  
Colah's Blog, 2015, UNDERSTANDING LSTM N
[7]  
Cho K., 2014, C EMP METH NAT LANG, P1724, DOI 10.3115/v1/d14-1179
[8]  
Chung J., 2014, NIPS 2014 WORKSHOP D
[9]  
Finn C, 2017, PR MACH LEARN RES, V70
[10]
Gers FA, Schmidhuber J, Cummins F. Learning to forget: Continual prediction with LSTM [J]. Neural Computation, 2000, 12(10): 2451-2471.