Advances and Trends of Continual Learning

Cited by: 0
Authors
Li, Wenbin [1]
Xiong, Yakun [1]
Fan, Zhichen [1]
Deng, Bo [2]
Cao, Fuyuan [3, 4]
Gao, Yang [1]
Affiliations
[1] State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing
[2] Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing
[3] School of Computer and Information Technology, Shanxi University, Taiyuan
[4] School of Big Data, Shanxi University, Taiyuan
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2024, Vol. 61, No. 06
Funding
National Natural Science Foundation of China
Keywords
catastrophic forgetting; continual learning; deep learning; knowledge transfer; sequential task
DOI
10.7544/issn1000-1239.202220820
Abstract
With the development and successful application of deep learning, continual learning has attracted increasing attention and become a hot topic in machine learning, especially in resource-limited and data-security scenarios, where the demand for quickly learning sequential tasks and data keeps growing. Unlike humans, who can continually learn and transfer knowledge, existing deep learning models are prone to catastrophic forgetting during sequential learning. The core problem of continual learning is therefore how to learn new knowledge while retaining old knowledge on dynamic, non-stationary sequential tasks and streaming data. First, based on a survey of recent work on continual learning at home and abroad, we divide continual learning methods into three broad categories: replay-based, constraint-based, and architecture-based. We then subdivide each category. Replay-based methods are divided into sample replay, generative replay, and pseudo-sample replay, according to the source of the replayed samples; constraint-based methods are divided into parameter constraints, gradient constraints, and data constraints, according to the source of the constraint; and architecture-based methods are divided into parameter isolation and model expansion, according to how the model structure is used. By comparing the innovations of related work, we summarize the advantages and disadvantages of each type of method. Second, we analyze research progress at home and abroad. Finally, we briefly discuss future directions for continual learning in combination with other fields. © 2024 Science Press. All rights reserved.
Pages: 1476-1496
Number of pages: 20