Subspace distillation for continual learning

Cited by: 7
Authors
Roy, Kaushik [1 ,3 ]
Simon, Christian [1 ,2 ,3 ]
Moghadam, Peyman [3 ,4 ]
Harandi, Mehrtash [1 ,3 ]
Affiliations
[1] Monash Univ, Melbourne, Vic, Australia
[2] Australian Natl Univ, Canberra, ACT, Australia
[3] CSIRO, Data61, Brisbane, Qld, Australia
[4] Queensland Univ Technol, Brisbane, Qld, Australia
Funding
Australian Research Council;
Keywords
Lifelong learning; Subspace distillation; Knowledge distillation; Continual semantic segmentation; Catastrophic forgetting; Background shift;
DOI
10.1016/j.neunet.2023.07.047
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An ultimate objective in continual learning is to preserve knowledge learned in preceding tasks while learning new tasks. To mitigate forgetting of prior knowledge, we propose a novel knowledge distillation technique that takes into account the manifold structure of the latent/output space of a neural network when learning novel tasks. To achieve this, we propose to approximate the data manifold up to first order, thereby using linear subspaces to model the structure and maintain the knowledge of a neural network while learning novel concepts. We demonstrate that modeling with subspaces provides several intriguing properties, including robustness to noise, and is therefore effective for mitigating catastrophic forgetting in continual learning. We also discuss and show how our proposed method can be adapted to address both classification and segmentation problems. Empirically, we observe that our proposed method outperforms various continual learning methods on several challenging datasets, including Pascal VOC and Tiny-Imagenet. Furthermore, we show how the proposed method can be seamlessly combined with existing learning approaches to improve their performance. The code for this article will be available at https://github.com/csiro-robotics/SDCL. © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
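The central idea described in the abstract, approximating the feature manifold to first order and distilling knowledge through linear subspaces, can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' released implementation (see the repository linked above): the function names, the subspace dimension k, and the projection-metric loss are assumptions introduced here for illustration.

```python
# Minimal sketch (assumptions, not the authors' released code): distill knowledge by
# matching the linear subspaces spanned by features of a frozen "old" model and the
# model currently being trained, as a first-order approximation of the feature manifold.
import torch


def top_k_subspace(features: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Orthonormal basis (dim x k) of the dominant subspace of a (batch x dim) feature matrix."""
    # Left singular vectors of the (dim x batch) matrix span the dominant feature directions.
    u, _, _ = torch.linalg.svd(features.t(), full_matrices=False)
    return u[:, : min(k, u.shape[1])]


def subspace_distillation_loss(feat_old: torch.Tensor,
                               feat_new: torch.Tensor,
                               k: int = 8) -> torch.Tensor:
    """Squared projection-metric distance between old- and new-model feature subspaces."""
    u_old = top_k_subspace(feat_old.detach(), k)   # frozen model: no gradient needed
    u_new = top_k_subspace(feat_new, k)            # current model: gradients flow through SVD
    p_old = u_old @ u_old.t()                      # orthogonal projectors onto each subspace
    p_new = u_new @ u_new.t()
    return (p_old - p_new).pow(2).sum()            # invariant to basis sign/rotation


# Usage sketch while training on a new task (old_model is frozen):
#   feats_old = old_model.features(x)
#   feats_new = new_model.features(x)
#   loss = task_loss + lambda_sd * subspace_distillation_loss(feats_old, feats_new)
```

Comparing orthogonal projectors rather than raw features makes the loss depend only on the spanned subspaces, which is one way to realize the noise robustness claimed in the abstract; the paper's exact formulation may differ.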
Pages: 65-79
Page count: 15