Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning

Cited by: 2
Authors
Carta, Antonio [1 ]
Cossu, Andrea [1 ]
Lomonaco, Vincenzo [1 ]
Bacciu, Davide [1 ]
van de Weijer, Joost [2 ]
Affiliations
[1] Univ Pisa, Dept Comp Sci, Pisa, Italy
[2] Comp Vis Ctr, Barcelona, Spain
Keywords
Continual learning; Model consolidation; Distributed continual learning;
DOI
10.1016/j.neucom.2024.127935
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In continual learning applications on the edge, multiple self-centered devices (SCDs) learn different local tasks independently, with each SCD optimizing only its own task. Can we achieve (almost) zero-cost collaboration between different devices? We formalize this problem as a Distributed Continual Learning (DCL) scenario, where SCDs greedily adapt to their own local tasks and a separate continual learning (CL) model performs a sparse and asynchronous consolidation step that combines the SCD models sequentially into a single multi-task model without using the original data. Unfortunately, current CL methods are not directly applicable to this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method which performs distillation in the latent space via a novel Projected Latent Distillation loss. Experimental results show that DAC enables forward transfer between SCDs and reaches state-of-the-art accuracy on Split CIFAR100, CORe50 and Split TinyImageNet, in both single-device and distributed CL scenarios. Somewhat surprisingly, a single out-of-distribution image is sufficient as the only source of data for DAC.
Pages: 9
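
To make the abstract's description more concrete, the following is a minimal sketch, in PyTorch, of how a projected latent distillation loss and the double-distillation consolidation step could be wired up. It assumes MSE as the distance, a single learnable linear projector from the student's latent space to the teacher's, and feature-extractor callables for each model; these choices, along with the consolidation_loss helper and its names, are illustrative assumptions and not the authors' exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ProjectedLatentDistillationLoss(nn.Module):
    """Match student latents to teacher latents through a learnable projection (sketch)."""

    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # Learnable linear map from the student's latent space to the teacher's.
        self.projector = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_latent: torch.Tensor, teacher_latent: torch.Tensor) -> torch.Tensor:
        # Project the student's features, then penalise their distance to the
        # frozen teacher's features (MSE chosen here for illustration).
        projected = self.projector(student_latent)
        return F.mse_loss(projected, teacher_latent.detach())


def consolidation_loss(student_feats, prev_cl_feats, scd_feats, pld_prev, pld_new, x_surrogate):
    """Double distillation on surrogate inputs (hypothetical helper).

    `*_feats` are callables mapping a batch of images to latent features; the
    previous consolidated CL model and the newly received SCD model act as the
    two teachers, and `x_surrogate` stands in for data-agnostic inputs such as
    a single out-of-distribution image.
    """
    z_student = student_feats(x_surrogate)
    with torch.no_grad():
        z_prev = prev_cl_feats(x_surrogate)   # teacher 1: previous consolidated model
        z_new = scd_feats(x_surrogate)        # teacher 2: incoming SCD model
    return pld_prev(z_student, z_prev) + pld_new(z_student, z_new)

In this sketch the consolidated (student) model is updated using only surrogate inputs, which mirrors the abstract's claim that consolidation needs no original task data.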