Deep Learning for Class-Incremental Learning: A Survey

Cited by: 0
Authors
Zhou D.-W. [1]
Wang F.-Y. [1]
Ye H.-J. [1]
Zhan D.-C. [1]
Affiliations
[1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing
Source
Jisuanji Xuebao/Chinese Journal of Computers | 2023 / Vol. 46 / No. 08
Keywords
catastrophic forgetting; class-incremental learning; continual learning; dynamic environment; model reuse
DOI
10.11897/SP.J.1016.2023.01577
Abstract
Recent years have witnessed the progress of deep learning in many fields, e.g., image classification and face recognition. Current deep models are deployed in a static environment, which requires collecting all the training data before the learning process; once training terminates, the model cannot be updated further. However, data in the real world often arrive in stream format and contain instances from new classes. For example, in an opinion monitoring system, new topics emerge as time goes by; on an electronic commerce platform, new types of products arise day by day; in the robot learning scenario, the robot is required to learn new commands continually. As a result, an ideal model should learn from stream data and enhance its learning ability incrementally. Such a learning process, namely Class-Incremental Learning (CIL), is drawing more and more attention from the machine learning community. Directly updating the incremental model with new-class data causes the model to forget old classes and degrades overall performance, a phenomenon denoted as catastrophic forgetting in the literature. As a result, a class-incremental learning model should incorporate new classes while resisting catastrophic forgetting of old ones. In this paper, we summarize and classify recent deep-learning-based class-incremental learning algorithms from three aspects, i.e., input, parameter, and algorithm. Typical class-incremental learning methods from the input aspect address incremental learning tasks by regularizing and rehearsing the exemplar set, and can be divided into data replay-based and data restriction-based methods. Similar to the human learning process, data replay-based CIL methods replay former instances when learning new ones, striking a trade-off between learning new knowledge and remembering old knowledge. Data restriction-based methods utilize former exemplars as a regularizer to restrict the direction of model updating. Class-incremental learning methods from the parameter aspect address incremental learning tasks by regularizing model updating and adjusting the network structure, and can be divided into parameter regularization-based and dynamic architecture-based methods. Parameter regularization-based methods weigh the importance of each parameter and restrict important parameters from being changed to overcome forgetting. Dynamic architecture-based methods, on the other hand, adjust the network structure dynamically to meet the requirements of incoming new classes. Class-incremental learning methods from the algorithm aspect address incremental learning tasks by model mapping and reducing inductive bias, and can be divided into knowledge distillation-based and post-tuning-based methods. Knowledge distillation-based CIL methods utilize the former model as a teacher to constrain the updating process of the current model. Post-tuning-based methods reduce the bias in the incremental model to obtain unbiased predictions. In this paper, we conduct extensive experimental verification with ten typical algorithms on the benchmark datasets CIFAR-100 and ImageNet ILSVRC2012. We analyze the behaviors of incremental models, including the accuracy trend, running time, memory budget, performance decay, and confusion matrix. We also summarize common rules for class-incremental learning algorithms. Finally, we analyze the challenges and future trends and conclude the paper.
© 2023 Science Press. All rights reserved.
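Two of the mechanisms the abstract names, exemplar replay and knowledge distillation, are concrete enough to sketch in code. The following minimal PyTorch sketch is not code from the surveyed paper: the function names (`incremental_step`, `distillation_loss`) and hyperparameters (`T`, `alpha`) are illustrative assumptions. It shows how a frozen copy of the previous-phase model can act as a teacher while the data loader mixes new-class data with replayed exemplars of old classes.

```python
# Minimal sketch (assumed names, not the survey's reference code) of
# replay + distillation for class-incremental learning.
import copy
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soften both output distributions with temperature T and match them via KL."""
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)

def incremental_step(model, old_model, loader, n_old_classes, alpha=0.5, lr=0.01):
    """One incremental phase; `loader` mixes new-class data with replayed exemplars."""
    old_model.eval()  # frozen teacher from the previous phase
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for x, y in loader:
        logits = model(x)
        with torch.no_grad():
            old_logits = old_model(x)  # teacher targets over the old classes
        # Cross-entropy learns new classes (and rehearses old ones via replay);
        # distillation on the old-class logits resists catastrophic forgetting.
        loss = F.cross_entropy(logits, y) + alpha * distillation_loss(
            logits[:, :n_old_classes], old_logits[:, :n_old_classes])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return copy.deepcopy(model)  # the updated model becomes the next phase's teacher
```

In this sketch, `alpha` trades off plasticity (learning new classes) against stability (matching the teacher), which mirrors the trade-off the abstract attributes to replay- and distillation-based methods.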
Pages: 1577-1605
Page count: 28