Continual Attentive Fusion for Incremental Learning in Semantic Segmentation

Cited by: 16

Authors:

Yang, Guanglei [1]

Fini, Enrico [2]

Xu, Dan [3]

Rota, Paolo [2]

Ding, Mingli [1]

Tang, Hao [4]

Alameda-Pineda, Xavier [5]

Ricci, Elisa [6,7]

Affiliations:

[1] Harbin Inst Technol HIT, Sch Instrument Sci & Engn, Harbin 150001, Peoples R China

[2] Univ Trento, Dept Informat Engn & Comp Sci, I-38123 Povo, Italy

[3] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong 999077, Peoples R China

[4] Swiss Fed Inst Technol, Dept Informat Technol & Elect Engn, CH-8092 Zurich, Switzerland

[5] INRIA, RobotLearn Grp, F-38330 Montbonnot St Martin, France

[6] Univ Trento, Dept Informat Engn & Comp Sci, I-38123 Povo, Italy

[7] Fdn Bruno Kessler, Deep Visual Learning Grp, I-38123 Trento, Italy

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Funding:

EU Horizon 2020;

Keywords:

Task analysis; Semantics; Image segmentation; Tensors; Feature extraction; Deep learning; Training; Incremental learning; knowledge distillation; semantic segmentation;

DOI:

10.1109/TMM.2022.3167555

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Over the past years, semantic segmentation, similar to many other tasks in computer vision, has benefited from the progress in deep neural networks, resulting in significantly improved performance. However, deep architectures trained with gradient-based techniques suffer from catastrophic forgetting, which is the tendency to forget previously learned knowledge while learning new tasks. Aiming at devising strategies to counteract this effect, incremental learning approaches have gained popularity over the past years. However, the first incremental learning methods for semantic segmentation appeared only recently. While effective, these approaches do not account for a crucial aspect in pixel-level dense prediction problems, i.e., the role of attention mechanisms. To fill this gap, in this paper, we introduce a novel attentive feature distillation approach to mitigate catastrophic forgetting while accounting for semantic spatial- and channel-level dependencies. Furthermore, we propose a continual attentive fusion structure, which takes advantage of the attention learned from the new and the old tasks while learning features for the new task. Finally, we also introduce a novel strategy to account for the background class in the distillation loss, thus preventing biased predictions. We demonstrate the effectiveness of our approach with an extensive evaluation on Pascal-VOC 2012 and ADE20K, setting a new state of the art.

Pages: 3841-3854

Page count: 14

