Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning

Cited by: 2
Authors
Carta, Antonio [1 ]
Cossu, Andrea [1 ]
Lomonaco, Vincenzo [1 ]
Bacciu, Davide [1 ]
van de Weijer, Joost [2 ]
Affiliations
[1] Univ Pisa, Dept Comp Sci, Pisa, Italy
[2] Comp Vis Ctr, Barcelona, Spain
Keywords
Continual learning; Model consolidation; Distributed continual learning
DOI
10.1016/j.neucom.2024.127935
CLC classification
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In continual learning applications on the edge, multiple self-centered devices (SCDs) learn different local tasks independently, with each SCD optimizing only its own task. Can we achieve (almost) zero-cost collaboration between different devices? We formalize this problem as a Distributed Continual Learning (DCL) scenario, where SCDs greedily adapt to their own local tasks while a separate continual learning (CL) model performs a sparse and asynchronous consolidation step that combines the SCD models sequentially into a single multi-task model without using the original data. Unfortunately, current CL methods are not directly applicable to this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method that performs distillation in the latent space via a novel Projected Latent Distillation loss. Experimental results show that DAC enables forward transfer between SCDs and reaches state-of-the-art accuracy on Split CIFAR100, CORe50 and Split TinyImageNet, in both single-device and distributed CL scenarios. Somewhat surprisingly, a single out-of-distribution image is sufficient as the only source of data for DAC.
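To make the "double knowledge distillation in the latent space" idea more concrete, the following is a minimal, hypothetical PyTorch sketch. It is not the paper's implementation: the class name, the learned linear projection `proj`, the MSE latent term, and the temperature-scaled KL output term are illustrative assumptions about how a projected latent distillation loss could be combined with a standard output distillation term.

```python
# Hypothetical sketch (not the authors' code): a latent-space distillation
# term through a learned linear projection, plus an optional output-space
# distillation term, standing in for the "double" distillation described
# in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProjectedLatentDistillation(nn.Module):
    """Match the student's projected latent features to a frozen teacher's."""

    def __init__(self, student_dim: int, teacher_dim: int, temperature: float = 2.0):
        super().__init__()
        # Learned linear map from the student's latent space to the teacher's.
        self.proj = nn.Linear(student_dim, teacher_dim, bias=False)
        self.temperature = temperature

    def forward(self, student_feats, teacher_feats,
                student_logits=None, teacher_logits=None):
        # Latent-space term: MSE between projected student features and
        # detached teacher features.
        latent_loss = F.mse_loss(self.proj(student_feats), teacher_feats.detach())

        # Optional output-space term: temperature-scaled soft-target KL.
        output_loss = torch.zeros((), device=student_feats.device)
        if student_logits is not None and teacher_logits is not None:
            t = self.temperature
            output_loss = F.kl_div(
                F.log_softmax(student_logits / t, dim=-1),
                F.softmax(teacher_logits.detach() / t, dim=-1),
                reduction="batchmean",
            ) * (t * t)

        return latent_loss + output_loss


if __name__ == "__main__":
    # Toy usage with random tensors standing in for activations computed on
    # data-agnostic inputs (the abstract notes even a single
    # out-of-distribution image can serve as the distillation data source).
    loss_fn = ProjectedLatentDistillation(student_dim=128, teacher_dim=256)
    s_feats, t_feats = torch.randn(8, 128), torch.randn(8, 256)
    s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
    print(loss_fn(s_feats, t_feats, s_logits, t_logits).item())
```

In this sketch the projection absorbs the dimensionality mismatch between the SCD (teacher) and consolidated (student) representations, so the student is free to organize its own latent space while still preserving the teacher's knowledge; the exact loss weighting and projection design in the paper may differ.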
Pages: 9