AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation

Cited by: 0
Authors
Li, Jingru [1 ]
Zhou, Sheng [1 ]
Li, Liangcheng [1 ]
Wang, Haishuai [1 ]
Bu, Jiajun [1 ]
Yu, Zhi [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Data-free knowledge distillation; Unsupervised representation learning; Knowledge distillation;
DOI
10.1016/j.neunet.2024.106386
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In scenarios such as privacy protection or large-scale data transmission, data-free knowledge distillation (DFKD) methods are proposed to perform knowledge distillation (KD) when the original data is not accessible. They generate pseudo samples by extracting knowledge from the teacher model and use these pseudo samples for KD. The limitation of previous DFKD methods lies in the static nature of their target distributions and their focus on learning instance-level distributions, which makes them rely heavily on the pretrained teacher model. To address these concerns, our study introduces a novel DFKD approach, AdaDFKD, designed to establish and exploit relationships among pseudo samples that adapt to the student model, thereby effectively mitigating the aforementioned risk. We achieve this by generating samples from "easy-to-discriminate" to "hard-to-discriminate", as humans do. We design a relationship refinement module (R2M) to optimize the generation process, in which we learn a progressive conditional distribution of negative samples and maximize the log-likelihood of inter-sample similarity of pseudo samples. Theoretically, we show that this design of AdaDFKD both minimizes the divergence and maximizes the mutual information between the distributions of the teacher and student models. Experimental results demonstrate the superiority of our approach over state-of-the-art (SOTA) DFKD methods across various benchmarks, teacher-student pairs, and evaluation metrics, as well as its robustness and fast convergence.
Pages: 15
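The abstract describes the relational objective only at a high level. Below is a minimal sketch, assuming a standard contrastive-style formulation, of how maximizing the log-likelihood of inter-sample similarity between teacher and student embeddings of pseudo samples could look in PyTorch. The function names (`pairwise_log_prob`, `relational_kd_loss`) and the `temperature` hyperparameter are illustrative assumptions, not the paper's R2M implementation, and the progressive easy-to-hard scheduling of negative samples is omitted.

```python
# Hedged sketch of a relational (inter-sample) DFKD-style loss.
# Pseudo samples from a generator are embedded by teacher and student;
# pairwise similarities define a conditional distribution over the batch,
# and the student maximizes the log-likelihood of the teacher's relationships.
import torch
import torch.nn.functional as F


def pairwise_log_prob(features: torch.Tensor, temperature: float) -> torch.Tensor:
    """Row-wise log-distribution over the other samples in the batch,
    based on cosine similarity; self-similarity is masked out."""
    z = F.normalize(features, dim=1)               # (B, D) unit-norm embeddings
    sim = z @ z.t() / temperature                  # (B, B) scaled similarities
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(mask, float("-inf"))     # exclude each sample itself
    return F.log_softmax(sim, dim=1)


def relational_kd_loss(student_feat: torch.Tensor,
                       teacher_feat: torch.Tensor,
                       temperature: float = 0.1) -> torch.Tensor:
    """Cross-entropy between teacher and student inter-sample distributions:
    minimizing it maximizes the expected log-likelihood the student assigns
    to the teacher's sample-to-sample similarities."""
    log_p_student = pairwise_log_prob(student_feat, temperature)
    with torch.no_grad():                          # teacher is fixed in DFKD
        p_teacher = pairwise_log_prob(teacher_feat, temperature).exp()
    return -(p_teacher * log_p_student).sum(dim=1).mean()


if __name__ == "__main__":
    B, D = 32, 128                                 # pseudo-batch of generator outputs
    student_feat = torch.randn(B, D, requires_grad=True)
    teacher_feat = torch.randn(B, D)
    loss = relational_kd_loss(student_feat, teacher_feat)
    loss.backward()
    print(f"relational loss: {loss.item():.4f}")
```

In an actual easy-to-hard schedule one would presumably anneal which negative pairs contribute to the softmax (e.g., starting from well-separated pairs and gradually including harder ones); that scheduling is the part specific to R2M and is not reproduced here.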