AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation

Times Cited: 0
|
Authors
Li, Jingru [1 ]
Zhou, Sheng [1 ]
Li, Liangcheng [1 ]
Wang, Haishuai [1 ]
Bu, Jiajun [1 ]
Yu, Zhi [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Zheda Rd, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Data-free knowledge distillation; Unsupervised representation learning; Knowledge distillation;
DOI
10.1016/j.neunet.2024.106386
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In scenarios such as privacy protection or large-scale data transmission, data-free knowledge distillation (DFKD) methods are proposed to perform Knowledge Distillation (KD) when the original data are not accessible. They generate pseudo samples by extracting knowledge from the teacher model and use these pseudo samples for KD. The challenge in previous DFKD methods lies in the static nature of their target distributions: by focusing on instance-level distributions, they become overly reliant on the pretrained teacher model. To address these concerns, our study introduces a novel DFKD approach, AdaDFKD, designed to establish and exploit relationships among pseudo samples that adapt to the student model, thereby effectively mitigating the aforementioned risk. We achieve this by generating samples from "easy-to-discriminate" to "hard-to-discriminate", as humans learn. We design a relationship refinement module (R2M) to optimize the generation process, in which we learn a progressive conditional distribution of negative samples and maximize the log-likelihood of inter-sample similarity among pseudo samples. Theoretically, we show that this design of AdaDFKD both minimizes the divergence and maximizes the mutual information between the distributions of the teacher and student models. Experimental results demonstrate the superiority of our approach over state-of-the-art (SOTA) DFKD methods across various benchmarks, teacher-student pairs, and evaluation metrics, as well as its robustness and fast convergence.
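To make the abstract's "maximize the log-likelihood of inter-sample similarity" objective more concrete, below is a minimal sketch of what such a relationship loss could look like, assuming R2M behaves like a temperature-scaled InfoNCE objective over teacher and student features of pseudo samples. The function name `r2m_relationship_loss`, the use of in-batch negatives, and the fixed temperature are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def r2m_relationship_loss(student_feats: torch.Tensor,
                          teacher_feats: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch of an inter-sample relationship objective.

    Maximizes the log-likelihood that each pseudo sample's student
    representation matches its own teacher representation, relative to
    the other samples in the batch (an InfoNCE-style contrast). This
    illustrates learning from inter-sample similarity; it is not the
    paper's exact R2M module.
    """
    s = F.normalize(student_feats, dim=1)          # (B, D) student features
    t = F.normalize(teacher_feats, dim=1)          # (B, D) teacher features
    logits = s @ t.T / temperature                 # (B, B) pairwise similarities
    targets = torch.arange(s.size(0), device=s.device)
    # Cross-entropy on the diagonal pairs = negative log-likelihood of the
    # matching teacher-student pair, with in-batch samples as negatives.
    return F.cross_entropy(logits, targets)
```

Under these assumptions, the "easy-to-discriminate" to "hard-to-discriminate" curriculum described in the abstract could be realized by, for example, annealing the temperature or reweighting negatives as training progresses, so that the generator is gradually pushed toward samples the student finds harder to separate.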
Pages: 15
Related Papers
37 records in total
  • [21] IFHE: Intermediate-Feature Heterogeneity Enhancement for Image Synthesis in Data-Free Knowledge Distillation
    Chen, Yi
    Liu, Ning
    Ren, Ao
    Yang, Tao
    Liu, Duo
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [22] Dual discriminator adversarial distillation for data-free model compression
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Manic, Milos
    Zhou, Huiyu
    Yu, Hui
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (05) : 1213 - 1230
  • [24] Data-Free Low-Bit Quantization via Dynamic Multi-teacher Knowledge Distillation
    Huang, Chong
    Lin, Shaohui
    Zhang, Yan
    Li, Ke
    Zhang, Baochang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII, 2024, 14432 : 28 - 41
  • [25] Better Together: Data-Free Multi-Student Coevolved Distillation
    Chen, Weijie
    Xuan, Yunyi
    Yang, Shicai
    Xie, Di
    Lin, Luojun
    Zhuang, Yueting
    KNOWLEDGE-BASED SYSTEMS, 2024, 283
  • [26] ENHANCING DATA-FREE ADVERSARIAL DISTILLATION WITH ACTIVATION REGULARIZATION AND VIRTUAL INTERPOLATION
    Qu, Xiaoyang
    Wang, Jianzong
    Xiao, Jing
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3340 - 3344
  • [27] CDFKD-MFS: Collaborative Data-Free Knowledge Distillation via Multi-Level Feature Sharing
    Hao, Zhiwei
    Luo, Yong
    Wang, Zhi
    Hu, Han
    An, Jianping
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 4262 - 4274
  • [28] Exploring the Distributed Knowledge Congruence in Proxy-data-free Federated Distillation
    Wu, Zhiyuan
    Sun, Sheng
    Wang, Yuwei
    Liu, Min
    Pan, Quyang
    Zhang, Junbo
    Li, Zeju
    Liu, Qingxiang
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2024, 15 (02)
  • [29] Privacy-Preserving Student Learning with Differentially Private Data-Free Distillation
    Liu, Bochao
    Lu, Jianghu
    Wang, Pengju
    Zhang, Junjie
    Zeng, Dan
    Qian, Zhenxing
    Ge, Shiming
    2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2022,
  • [30] Relation-Guided Adversarial Learning for Data-Free Knowledge Transfer
    Liang, Yingping
    Fu, Ying
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024: 2868 - 2885