Continual Learning Based on Knowledge Distillation and Representation Learning

Cited by: 0
Authors
Chen, Xiu-Yan [1]
Liu, Jian-Wei [1]
Li, Wen-Tao [1]
Affiliations
[1] China Univ Petr, Dept Automat, Coll Informat Sci & Engn, Beijing 102249, Peoples R China
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV | 2022 / Vol. 13532
Keywords
Continual learning; Class incremental learning; Representation learning; Knowledge distillation; Catastrophic forgetting;
DOI
10.1007/978-3-031-15937-4_3
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104; 0812; 0835; 1405
Abstract
In recent years, continual learning, which is more in line with real-world scenarios, has received increasing attention. To address the catastrophic forgetting problem in continual learning, researchers have proposed various solutions, which can be broadly grouped into three types: network structure-based methods, rehearsal-based methods, and regularization-based methods. Inspired by pseudo-rehearsal and regularization methods, we propose a novel Continual Learning model Based on Knowledge Distillation and Representation Learning (KRCL), which employs a Beta-VAE as a representation learning module to extract a shared representation of the learned tasks. The Beta-VAE also serves as a generative model that produces pseudo samples of historical tasks; KRCL trains these pseudo samples of previous tasks together with the data of the current task, and combines this with a knowledge distillation process that extracts the dark knowledge from the old task model to alleviate catastrophic forgetting. We compare KRCL with the Fine-tune, LWF, IRCL and KRCL-real baseline methods on four benchmark datasets. The results show that KRCL achieves state-of-the-art performance on standard continual learning tasks.
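As a reading aid, the sketch below is a minimal PyTorch rendering of the kind of combined objective the abstract describes: a Beta-VAE reconstruction and KL term for the shared representation, a classification term on the latent code, and a Hinton-style distillation term against a frozen copy of the old-task classifier. It is not the authors' implementation; the names (BetaVAE, krcl_loss), network sizes, loss weights (beta, lambda_kd, temperature), and the assumption that pseudo samples of earlier tasks are already mixed into the batch are all illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    """Toy Beta-VAE providing a shared latent representation and a decoder
    that can also be used to generate pseudo samples of earlier tasks."""
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation trick
        return self.dec(z), mu, logvar, z

def krcl_loss(vae, classifier, old_classifier, x, y,
              beta=4.0, lambda_kd=1.0, temperature=2.0):
    """Combined objective on a batch assumed to already mix current-task data
    with Beta-VAE pseudo samples of previous tasks."""
    x_rec, mu, logvar, z = vae(x)
    # Beta-VAE terms: reconstruction + beta-weighted KL divergence.
    rec = F.binary_cross_entropy(x_rec, x, reduction="sum") / x.size(0)
    kld = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    # Classification on the shared latent representation.
    logits = classifier(z)
    ce = F.cross_entropy(logits, y)
    # Knowledge distillation: match softened outputs of the frozen old-task model.
    with torch.no_grad():
        old_logits = old_classifier(z)
    kd = F.kl_div(F.log_softmax(logits / temperature, dim=1),
                  F.softmax(old_logits / temperature, dim=1),
                  reduction="batchmean") * temperature ** 2
    return rec + beta * kld + ce + lambda_kd * kd

# Illustrative usage with random tensors standing in for a mixed batch.
vae = BetaVAE()
classifier, old_classifier = nn.Linear(32, 10), nn.Linear(32, 10)
x, y = torch.rand(16, 784), torch.randint(0, 10, (16,))
krcl_loss(vae, classifier, old_classifier, x, y).backward()
```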
Pages: 27-38
Number of pages: 12
References
20 in total
[1] Rusu A.A., et al. Progressive Neural Networks. arXiv:1606.04671, 2016. DOI 10.48550/arXiv.1606.04671
[2] Aljundi R., Babiloni F., Elhoseiny M., Rohrbach M., Tuytelaars T. Memory Aware Synapses: Learning What (not) to Forget. Computer Vision - ECCV 2018, Pt III, 2018, 11207: 144-161
[3] [Anonymous]. CIFAR-100 Dataset. 2009
[4] [Anonymous]. Learning to Learn. 1998. DOI 10.1007/978-1-4615-5529-2
[5] De Lange M., Aljundi R., Masana M., Parisot S., Jia X., Leonardis A., Slabaugh G., Tuytelaars T. A Continual Learning Survey: Defying Forgetting in Classification Tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(7): 3366-3385
[6] Girshick R., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014: 580-587
[7] Higgins I., et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. ICLR, 2017
[8] Hinton G. Distilling the knowledge in a neural network. 2014
[9] Kirkpatrick J., Pascanu R., Rabinowitz N., Veness J., Desjardins G., Rusu A.A., Milan K., Quan J., Ramalho T., Grabska-Barwinska A., Hassabis D., Clopath C., Kumaran D., Hadsell R. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences of the United States of America, 2017, 114(13): 3521-3526
[10] LeCun Y. The MNIST database of handwritten digits. 1998