Continuous transfer of neural network representational similarity for incremental learning

Cited by: 43
Authors
Tian, Songsong [1 ,2 ]
Li, Weijun [1 ,3 ,4 ]
Ning, Xin [1 ,3 ,4 ,5 ]
Ran, Hang [1 ]
Qin, Hong [1 ,3 ,4 ]
Tiwari, Prayag [6 ]
Affiliations
[1] Chinese Acad Sci, Inst Semicond, Beijing 100083, Peoples R China
[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Ctr Mat Sci & Optoelect Engn, Beijing 100049, Peoples R China
[4] Univ Chinese Acad Sci, Sch Integrated Circuits, Beijing 100049, Peoples R China
[5] Zhongke Ruitu Technol Co Ltd, Beijing 100096, Peoples R China
[6] Halmstad Univ, Sch Informat Technol, S-30118 Halmstad, Sweden
Keywords
Incremental learning; Pre-trained model; Knowledge distillation; Neural network representation
DOI
10.1016/j.neucom.2023.126300
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The incremental learning paradigm in machine learning has long been a focus of academic research. It resembles the way biological systems learn and reduces energy consumption by avoiding excessive retraining. Existing studies exploit the powerful feature extraction capabilities of pre-trained models to address incremental learning, but the feature knowledge inside the neural network remains underutilized. To address this issue, this paper proposes a novel method called Pre-trained Model Knowledge Distillation (PMKD), which combines knowledge distillation of neural network representations with replay. We design a loss function based on centered kernel alignment (CKA) to transfer neural network representation knowledge from the pre-trained model to the incremental model layer by layer. In addition, a memory buffer used for Dark Experience Replay helps the model better retain past knowledge. Experiments show that PMKD achieves superior performance across various datasets and buffer sizes, reaching the best class-incremental learning accuracy among the compared methods. The open-source code is published at https://github.com/TianSongS/PMKD-IL.
(c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
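The abstract names two ingredients: a CKA-based loss that matches intermediate representations layer by layer, and Dark Experience Replay from a memory buffer. The following is a minimal, illustrative PyTorch sketch of how such terms could look; the function names (`linear_cka`, `cka_distillation_loss`, `der_replay_loss`), the epsilon, and the weighting are our own assumptions for illustration, not the authors' released implementation (see the linked repository for that).

```python
import random
import torch
import torch.nn.functional as F


def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear centered kernel alignment (CKA) between two activation batches.

    x, y: (batch, features) matrices taken from corresponding layers.
    Returns a similarity in [0, 1]; higher means more similar representations.
    """
    x = x - x.mean(dim=0, keepdim=True)  # center each feature over the batch
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.norm(y.t() @ x, p="fro") ** 2
    return cross / (torch.norm(x.t() @ x, p="fro")
                    * torch.norm(y.t() @ y, p="fro") + 1e-8)


def cka_distillation_loss(teacher_feats, student_feats):
    """Layer-by-layer representation-transfer term: penalize 1 - CKA between
    the frozen pre-trained (teacher) features and the incremental (student)
    features at matching layers."""
    terms = []
    for t, s in zip(teacher_feats, student_feats):
        t = t.flatten(start_dim=1)  # collapse spatial dims per sample
        s = s.flatten(start_dim=1)
        terms.append(1.0 - linear_cka(t, s))
    return torch.stack(terms).mean()


def der_replay_loss(model, buffer, batch_size=32, alpha=0.5):
    """Dark Experience Replay term: match current logits to the logits that
    were stored when the buffered samples were first seen.

    buffer: list of (input, stored_logits) pairs kept across tasks.
    """
    if len(buffer) == 0:
        return torch.tensor(0.0)
    xs, old_logits = zip(*random.sample(buffer, min(batch_size, len(buffer))))
    xs, old_logits = torch.stack(xs), torch.stack(old_logits)
    return alpha * F.mse_loss(model(xs), old_logits)
```

A total objective would then combine the usual cross-entropy on the current task with these two terms, e.g. `ce + lambda_cka * cka_distillation_loss(...) + der_replay_loss(...)`. Using 1 - CKA as the distance is a natural choice for layer-wise matching because linear CKA is invariant to orthogonal transformations and isotropic scaling of the feature space.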
Pages: 11