Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Unlike traditionally trained models, PTMs possess generalizable embeddings that can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat the state of the art even without training on the downstream task. (2) Due to the distribution gap between pre-training and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, retaining the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, considering that previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
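As a rough illustration of the two ideas summarized in the abstract, the sketch below is not the authors' implementation: the function names (class_prototypes, aper_embedding, predict) are invented for this example, and random vectors stand in for the embeddings a frozen pre-trained model would produce. It shows a prototype-based classifier in the SimpleCIL spirit and a simple concatenation of PTM and adapted-model embeddings in the Aper spirit.

```python
# Minimal sketch of SimpleCIL-style prototype classifiers and Aper-style
# embedding aggregation. Assumes features are already extracted by a frozen
# pre-trained model; all names here are illustrative, not the official API.
import numpy as np

def class_prototypes(features, labels):
    """SimpleCIL idea: the classifier weight of each class is the mean
    (prototype) of its embeddings; no gradient training is involved."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(prototypes, query):
    """Assign the query to the class whose prototype has the highest
    cosine similarity with the query embedding."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return max(prototypes, key=lambda c: cos(prototypes[c], query))

def aper_embedding(ptm_feat, adapted_feat):
    """Aper idea (simplified): aggregate the generalizable embedding of the
    frozen PTM with the embedding of a model adapted on the downstream data,
    then build prototypes in this concatenated space."""
    return np.concatenate([ptm_feat, adapted_feat], axis=-1)

# Toy usage with random vectors standing in for PTM / adapted-model features.
rng = np.random.default_rng(0)
ptm_feats = rng.normal(size=(20, 8))
adapted_feats = rng.normal(size=(20, 8))
labels = np.repeat([0, 1], 10)

agg = aper_embedding(ptm_feats, adapted_feats)   # shape (20, 16)
protos = class_prototypes(agg, labels)
print(predict(protos, agg[0]))                   # predicted class for sample 0
```

In the actual paper the aggregation is done between the frozen PTM and a parameter-efficiently tuned copy of it, and classification uses these prototype-based classifiers without further training on old data; the snippet only conveys the overall flow.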
Pages: 1012-1032
Number of pages: 21