Class-Prompting Transformer for Incremental Semantic Segmentation

Cited: 0
Authors
Song, Zichen [1 ]
Shi, Zhaofeng [1 ]
Shang, Chao [1 ]
Meng, Fanman [1 ]
Xu, Linfeng [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Task analysis; Semantic segmentation; Visualization; Semantics; Decoding; Computational modeling; Incremental semantic segmentation; knowledge distillation; class prompt learning;
DOI
10.1109/ACCESS.2023.3315327
Chinese Library Classification
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
Class-incremental Semantic Segmentation (CISS) aims to learn new tasks sequentially, assigning a specific category to each pixel of a given image while preserving the original capability to segment the old classes even when the labels of old tasks are absent. Most existing CISS methods suppress catastrophic forgetting by distilling directly on specific layers, which ignores the semantic gap between training data of old and new classes drawn from different distributions; this leads to distillation errors that degrade segmentation performance. In this paper, we propose a Class-prompting Transformer (CPT) that introduces external prior knowledge from a pre-trained vision-language encoder into CISS pipelines to bridge the old and new classes and perform more generalized initialization and distillation. Specifically, we propose a Prompt-guided Initialization Module (PIM), which measures the relationships between the class prompts and the old query parameters to initialize the new query parameters, transferring previous knowledge to the learning of new tasks. We further propose a Semantic-aligned Distillation Module (SDM), which incorporates class prompt information with the class-aware embeddings extracted from the decoder to mitigate the semantic gap between distinct class data and to conduct adaptive knowledge transfer that suppresses catastrophic forgetting. Extensive experiments on the Pascal VOC and ADE20K datasets for the CISS task demonstrate the superiority of the proposed method, which achieves state-of-the-art performance.
Pages: 100154-100164
Page count: 11