Recent Advances of Foundation Language Models-based Continual Learning: A Survey

Cited by: 4
Authors
Yang, Yutao [1]
Zhou, Jie [1]
Ding, Xuanwen [1]
Huai, Tianyu [1]
Liu, Shunyu [1]
Chen, Qin [1]
Xie, Yuan [1]
He, Liang [1]
Affiliations
[1] East China Normal University, School of Computer Science and Technology, Shanghai, People's Republic of China
Keywords
Continual learning; foundation language models; pre-trained language models; large language models; vision-language models; survey; neural networks; lifelong
DOI
10.1145/3705725
CLC number
TP301 [Theory and Methods];
Discipline classification code
081202;
Abstract
Recently, foundation language models (LMs) have achieved significant success in natural language processing and computer vision. Unlike traditional neural network models, foundation LMs acquire rich commonsense knowledge by pre-training a vast number of parameters on large unsupervised corpora, which gives them strong transfer-learning ability. Despite these capabilities, LMs still suffer from catastrophic forgetting, which hinders their ability to learn continuously as humans do. To address this, continual learning (CL) methodologies have been introduced, allowing LMs to adapt to new tasks while retaining previously learned knowledge. However, a systematic taxonomy of existing approaches and a comparison of their performance are still lacking. In this article, we provide a comprehensive review, summarization, and classification of the existing literature on CL-based approaches applied to foundation language models, such as pre-trained language models, large language models, and vision-language models. We divide these studies into offline and online CL, which comprise traditional methods, parameter-efficient-based methods, instruction tuning-based methods, and continual pre-training methods. Additionally, we outline the typical datasets and metrics employed in CL research and provide a detailed analysis of the challenges and future directions for LM-based continual learning.
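The abstract groups CL approaches for foundation LMs into traditional, parameter-efficient, instruction tuning-based, and continual pre-training methods. As an illustration of the parameter-efficient family only, the sketch below is not taken from the survey; the class names (`AdapterLayer`, `ContinualAdapterModel`) and the toy backbone are hypothetical. It freezes a pre-trained backbone and trains one small bottleneck adapter per task, so parameters fitted on earlier tasks are never overwritten.

```python
import torch
import torch.nn as nn


class AdapterLayer(nn.Module):
    """Bottleneck adapter applied as a residual on top of frozen features."""

    def __init__(self, hidden_size: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter only learns a small correction.
        return x + self.up(torch.relu(self.down(x)))


class ContinualAdapterModel(nn.Module):
    """Frozen backbone with one lightweight adapter per task (hypothetical sketch).

    New tasks add new adapters; adapters of earlier tasks are left untouched,
    which avoids overwriting previously acquired task-specific parameters.
    """

    def __init__(self, backbone: nn.Module, hidden_size: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # keep pre-trained weights fixed
        self.adapters = nn.ModuleDict()
        self.hidden_size = hidden_size

    def add_task(self, task_id: str) -> None:
        self.adapters[task_id] = AdapterLayer(self.hidden_size)

    def forward(self, x: torch.Tensor, task_id: str) -> torch.Tensor:
        h = self.backbone(x)
        return self.adapters[task_id](h)


# Usage: a toy linear layer stands in for a pre-trained LM encoder.
backbone = nn.Linear(32, 32)
model = ContinualAdapterModel(backbone, hidden_size=32)
for task in ["task_a", "task_b"]:
    model.add_task(task)
    # Only the current task's adapter parameters are optimized.
    optimizer = torch.optim.Adam(model.adapters[task].parameters(), lr=1e-3)
    # ... run the training loop for this task here ...
```

This per-task-adapter design trades a small amount of extra storage per task for strong protection against forgetting; regularization-based and replay-based methods, also covered by the survey's taxonomy, make different trade-offs.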
Pages: 38