Continual Learning From a Stream of APIs

Cited by: 0
Authors
Yang, Enneng [1 ]
Wang, Zhenyi [2 ]
Shen, Li [3 ,4 ]
Yin, Nan [5 ]
Liu, Tongliang [6 ]
Guo, Guibing [1 ]
Wang, Xingwei [1 ]
Tao, Dacheng [7 ]
Affiliations
[1] Northeastern Univ, Shenyang 110004, Peoples R China
[2] Univ Maryland, College Pk, MD 20742 USA
[3] Sun Yat Sen Univ, Guangzhou 510275, Peoples R China
[4] JD Explore Acad, Beijing 101111, Peoples R China
[5] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
[6] Univ Sydney, Camperdown, NSW 2050, Australia
[7] Nanyang Technol Univ, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Data-free learning; catastrophic forgetting; plasticity-stability; continual learning; NEURAL-NETWORKS; GAME; GO;
DOI
10.1109/TPAMI.2024.3460871
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Continual learning (CL) aims to learn new tasks without forgetting previous ones. However, existing CL methods require large amounts of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access only via APIs. This paper considers two practical yet novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw data, respectively. Performing CL under these two new settings faces several challenges: the full raw data are unavailable, the model parameters are unknown, the models are heterogeneous in architecture and scale, and knowledge from previous APIs is catastrophically forgotten. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, solely by querying the APIs. Specifically, our framework comprises two cooperative generators and one CL model, whose training is framed as an adversarial game. We first use the CL model and the current API as fixed discriminators to train the generators via a derivative-free method; the generators adversarially produce hard and diverse synthetic data that maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between its responses and those of the black-box API on the synthetic data, thereby transferring the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs. In the DFCL-APIs setting, our method performs comparably to classic CL trained on full raw data on the MNIST and SVHN datasets. In the DECL-APIs setting, our method achieves 0.97x, 0.75x, and 0.69x the performance of classic CL on the more challenging CIFAR10, CIFAR100, and MiniImageNet datasets, respectively.
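For intuition, below is a minimal, hypothetical PyTorch sketch of the adversarial distillation loop the abstract describes: a generator is updated with a two-point zeroth-order (derivative-free) estimator to maximize the response gap between the CL model and the black-box API, after which the CL model is updated by ordinary gradient descent to minimize that gap on the synthetic data. All names and hyperparameters here (Generator, api_fn, zeroth_order_ascent, the learning rates) are illustrative assumptions, not the paper's implementation; the paper's two cooperative generators and its network-similarity regularizer are omitted for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Noise -> flat synthetic inputs (hypothetical architecture)."""
    def __init__(self, z_dim=100, img_dim=3 * 32 * 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

def response_gap(cl_model, api_fn, x):
    """KL divergence between the CL model's and the black-box API's
    responses on x; the API is assumed to return soft labels."""
    return F.kl_div(F.log_softmax(cl_model(x), dim=1), api_fn(x),
                    reduction="batchmean")

@torch.no_grad()
def zeroth_order_ascent(gen, loss_fn, lr=1e-3, mu=1e-2, n_dirs=10):
    """Derivative-free generator update. The API is non-differentiable,
    so the gradient is estimated via random parameter perturbations
    (two-point zeroth-order estimator), and we ascend because the
    generator tries to MAXIMIZE the response gap (hard examples)."""
    theta = nn.utils.parameters_to_vector(gen.parameters())
    grad = torch.zeros_like(theta)
    base = loss_fn(gen)  # loss at the unperturbed parameters
    for _ in range(n_dirs):
        u = torch.randn_like(theta)
        nn.utils.vector_to_parameters(theta + mu * u, gen.parameters())
        grad += (loss_fn(gen) - base) / mu * u
    nn.utils.vector_to_parameters(theta + lr * grad / n_dirs,
                                  gen.parameters())

def distill_one_api(cl_model, api_fn, gen, steps=100, batch=64, z_dim=100):
    """One adversarial round against a single black-box API."""
    opt = torch.optim.Adam(cl_model.parameters(), lr=1e-3)
    for _ in range(steps):
        z = torch.randn(batch, z_dim)
        # (1) Generator step: maximize the CL-vs-API gap, derivative-free.
        zeroth_order_ascent(
            gen, lambda g: response_gap(cl_model, api_fn, g(z)).item())
        # (2) CL-model step: minimize the gap on the synthetic batch,
        #     transferring the API's knowledge into the CL model.
        loss = response_gap(cl_model, api_fn, gen(z).detach())
        opt.zero_grad()
        loss.backward()
        opt.step()

In the full framework, this inner loop would be repeated for each API in the stream, with the network-similarity regularization term added to the CL model's loss to prevent forgetting of previously distilled APIs. Here api_fn stands for any black-box query interface returning soft labels; a local stand-in could be api_fn = lambda x: F.softmax(teacher(x), dim=1).detach().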
Pages: 11432-11445
Number of pages: 14