Continuous transfer of neural network representational similarity for incremental learning

Cited by: 43
Authors
Tian, Songsong [1 ,2 ]
Li, Weijun [1 ,3 ,4 ]
Ning, Xin [1 ,3 ,4 ,5 ]
Ran, Hang [1 ]
Qin, Hong [1 ,3 ,4 ]
Tiwari, Prayag [6 ]
Affiliations
[1] Chinese Acad Sci, Inst Semicond, Beijing 100083, Peoples R China
[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Ctr Mat Sci & Optoelect Engn, Beijing 100049, Peoples R China
[4] Univ Chinese Acad Sci, Sch Integrated Circuits, Beijing 100049, Peoples R China
[5] Zhongke Ruitu Technol Co Ltd, Beijing 100096, Peoples R China
[6] Halmstad Univ, Sch Informat Technol, S-30118 Halmstad, Sweden
Keywords
Incremental learning; Pre-trained model; Knowledge distillation; Neural network representation
DOI
10.1016/j.neucom.2023.126300
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The incremental learning paradigm in machine learning has long been a focus of academic research. It resembles the way biological systems learn and reduces energy consumption by avoiding excessive retraining. Existing studies exploit the powerful feature extraction capabilities of pre-trained models to address incremental learning, but the feature knowledge inside the neural network remains underutilized. To address this issue, this paper proposes a novel method called Pre-trained Model Knowledge Distillation (PMKD), which combines knowledge distillation of neural network representations with replay. We design a loss function based on centered kernel alignment (CKA) to transfer neural network representation knowledge from the pre-trained model to the incremental model layer by layer. In addition, a memory buffer used for Dark Experience Replay helps the model better retain past knowledge. Experiments show that PMKD achieves superior performance across various datasets and buffer sizes, reaching the best class-incremental learning accuracy among the compared methods. The open-source code is published at https://github.com/TianSongS/PMKD-IL.
(c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
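The abstract names two ingredients: a CKA-based loss that matches intermediate representations layer by layer, and Dark Experience Replay from a memory buffer. The following is a minimal, illustrative PyTorch sketch of how such terms could look; the function names (`linear_cka`, `cka_distillation_loss`, `der_replay_loss`), the epsilon, and the weighting are our own assumptions for illustration, not the authors' released implementation (see the linked repository for that).

```python
import random
import torch
import torch.nn.functional as F


def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear centered kernel alignment (CKA) between two activation batches.

    x, y: (batch, features) matrices taken from corresponding layers.
    Returns a similarity in [0, 1]; higher means more similar representations.
    """
    x = x - x.mean(dim=0, keepdim=True)  # center each feature over the batch
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.norm(y.t() @ x, p="fro") ** 2
    return cross / (torch.norm(x.t() @ x, p="fro")
                    * torch.norm(y.t() @ y, p="fro") + 1e-8)


def cka_distillation_loss(teacher_feats, student_feats):
    """Layer-by-layer representation-transfer term: penalize 1 - CKA between
    the frozen pre-trained (teacher) features and the incremental (student)
    features at matching layers."""
    terms = []
    for t, s in zip(teacher_feats, student_feats):
        t = t.flatten(start_dim=1)  # collapse spatial dims per sample
        s = s.flatten(start_dim=1)
        terms.append(1.0 - linear_cka(t, s))
    return torch.stack(terms).mean()


def der_replay_loss(model, buffer, batch_size=32, alpha=0.5):
    """Dark Experience Replay term: match current logits to the logits that
    were stored when the buffered samples were first seen.

    buffer: list of (input, stored_logits) pairs kept across tasks.
    """
    if len(buffer) == 0:
        return torch.tensor(0.0)
    xs, old_logits = zip(*random.sample(buffer, min(batch_size, len(buffer))))
    xs, old_logits = torch.stack(xs), torch.stack(old_logits)
    return alpha * F.mse_loss(model(xs), old_logits)
```

A total objective would then combine the usual cross-entropy on the current task with these two terms, e.g. `ce + lambda_cka * cka_distillation_loss(...) + der_replay_loss(...)`. Using 1 - CKA as the distance is a natural choice for layer-wise matching because linear CKA is invariant to orthogonal transformations and isotropic scaling of the feature space.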
Pages: 11