Enhancing Visual Continual Learning with Language-Guided Supervision

Cited by: 3
Authors
Ni, Bolin [1 ,2 ]
Zhao, Hongbo [1 ,2 ]
Zhang, Chenghao [1 ,2 ]
Hu, Ke [2 ]
Meng, Gaofeng [1 ,2 ,3 ]
Zhang, Zhaoxiang [1 ,2 ,3 ]
Xiang, Shiming [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Artificial Intelligence & Robot, HK Inst Sci & Innovat, Beijing, Peoples R China
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
DOI
10.1109/CVPR52733.2024.02272
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Continual learning (CL) aims to empower models to learn new tasks without forgetting previously acquired knowledge. Most prior works concentrate on techniques such as architecture design, data replay, and regularization. However, the category name of each class is largely neglected: existing methods commonly use one-hot labels and randomly initialize the classifier head. We argue that the scarce semantic information conveyed by one-hot labels hampers effective knowledge transfer across tasks. In this paper, we revisit the role of the classifier head within the CL paradigm and replace the classifier with semantic knowledge from pretrained language models (PLMs). Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals during training. Such targets fully account for the semantic correlation among all classes across tasks. Empirical studies show that our approach mitigates forgetting by alleviating representation drift and facilitating knowledge transfer across tasks. The proposed method is simple to implement and can be seamlessly plugged into existing methods with negligible adjustments. Extensive experiments on eleven mainstream baselines demonstrate the effectiveness and generalizability of our approach across various protocols. For example, under the class-incremental learning setting on ImageNet-100, our method significantly improves Top-1 accuracy by 3.2% to 6.1% while reducing the forgetting rate by 2.6% to 13.1%.
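The classifier replacement described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical Python example, not the authors' implementation: it assumes a sentence-transformers model ("all-MiniLM-L6-v2") as the PLM, a learnable linear projection to match embedding dimensions, and cross-entropy over temperature-scaled cosine similarities to the frozen class-name embeddings; the paper's actual encoder and loss may differ.

```python
# Hypothetical sketch of language-guided supervision for continual learning.
# Assumptions (not from the paper's code): sentence-transformers as the PLM,
# a linear projection head, and cosine-similarity logits with cross-entropy.
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer


def build_semantic_targets(class_names, plm_name="all-MiniLM-L6-v2"):
    """Encode class names once with a frozen PLM; targets are never trained."""
    plm = SentenceTransformer(plm_name)
    with torch.no_grad():
        targets = plm.encode(class_names, convert_to_tensor=True)
    return F.normalize(targets, dim=-1)  # shape: (num_classes, text_dim)


class LanguageGuidedHead(torch.nn.Module):
    """Projects backbone features into the PLM space and scores them against
    frozen semantic targets instead of a randomly initialized classifier."""

    def __init__(self, feat_dim, targets, temperature=0.07):
        super().__init__()
        self.proj = torch.nn.Linear(feat_dim, targets.shape[1])
        self.register_buffer("targets", targets)  # frozen supervision signal
        self.temperature = temperature

    def forward(self, features, labels=None):
        z = F.normalize(self.proj(features), dim=-1)
        logits = z @ self.targets.t() / self.temperature
        if labels is None:
            return logits
        return logits, F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy usage: three classes, random features standing in for a backbone.
    targets = build_semantic_targets(["goldfish", "tabby cat", "school bus"])
    head = LanguageGuidedHead(feat_dim=512, targets=targets)
    feats = torch.randn(4, 512)
    labels = torch.tensor([0, 2, 1, 0])
    logits, loss = head(feats, labels)
    print(logits.shape, loss.item())
```

Because the targets are derived from class-name semantics and never updated, the supervision signal for previously seen classes stays fixed as new tasks arrive and new class embeddings are appended, which is consistent with the reduced representation drift the abstract reports.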
Pages: 24068-24077
Page count: 10