Better Together: Data-Free Multi-Student Coevolved Distillation

Cited by: 1
Authors
Chen, Weijie [1 ,2 ]
Xuan, Yunyi [2 ]
Yang, Shicai [2 ]
Xie, Di [2 ]
Lin, Luojun [3 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Hikvision Res Inst, Hangzhou, Peoples R China
[3] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Knowledge distillation; Adversarial training; Model inversion; Surrogate images; Mutual learning;
DOI
10.1016/j.knosys.2023.111146
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Data-Free Knowledge Distillation (DFKD) aims to craft a customized student model from a pre-trained teacher model by synthesizing surrogate training images. However, a seldom-investigated scenario is distilling the knowledge into multiple heterogeneous students simultaneously. In this paper, we study how to improve performance by coevolving peer students, which we term Data-Free Multi-Student Coevolved Distillation (DF-MSCD). Building on previous DFKD methods, we advance DF-MSCD by improving data quality from the perspective of synthesizing unbiased, informative, and diverse surrogate samples: 1) Unbiased. Because image synthesis at different timestamps of DFKD is disconnected, an unnoticed class imbalance problem arises. To tackle this problem, we reform the prior art into an unbiased variant by bridging the label distributions of the synthesized data across timestamps. 2) Informative. Unlike single-student DFKD, we encourage interactions not only between teacher-student pairs but also among peer students, driving a more comprehensive knowledge distillation. To this end, we devise a novel Inter-Student Adversarial Learning method to coevolve peer students with mutual benefits. 3) Diverse. To further promote Inter-Student Adversarial Learning, we develop a Mixture-of-Generators, in which multiple generators are optimized to synthesize different yet complementary samples by playing min-max games with multiple students. Experiments validate the effectiveness and efficiency of the proposed DF-MSCD, which surpasses existing state-of-the-art methods on multiple popular benchmarks. Notably, our method obtains heterogeneous students with a single training run, making it superior to single-student DFKD methods in terms of both training time and testing accuracy.
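The abstract describes an adversarial training loop among a frozen teacher, several peer students, and a mixture of generators. The following is a minimal PyTorch sketch of such a loop, reconstructed from the abstract alone: the TinyGenerator architecture, the kd_loss and peer_disagreement helpers, the uniform class sampling used to approximate the "unbiased" label-distribution bridging, and all loss weightings are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of one DF-MSCD-style training step in PyTorch.
# The tiny architectures, helper names, and loss weights below are
# illustrative assumptions reconstructed from the abstract only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGenerator(nn.Module):
    """Maps class-conditioned noise to 32x32 surrogate images."""
    def __init__(self, nz=100, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(n_classes, nz)
        self.net = nn.Sequential(
            nn.Linear(nz, 8 * 8 * 64), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.Upsample(scale_factor=4),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z, y):
        return self.net(z * self.embed(y))

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-scaled KL distillation loss (teacher -> student)."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T

def peer_disagreement(logits_list):
    """Mean pairwise KL divergence among peer students."""
    loss, pairs = 0.0, 0
    for i, li in enumerate(logits_list):
        for j, lj in enumerate(logits_list):
            if i != j:
                loss = loss + F.kl_div(F.log_softmax(li, dim=1),
                                       F.softmax(lj, dim=1),
                                       reduction="batchmean")
                pairs += 1
    return loss / max(pairs, 1)

def train_step(teacher, students, generators, opt_g, opt_s,
               batch_size=64, nz=100, n_classes=10, device="cpu"):
    teacher.eval()
    # Uniform class sampling is a crude stand-in for the paper's
    # "unbiased" bridging of label distributions across timestamps.
    y = torch.randint(0, n_classes, (batch_size,), device=device)
    z = torch.randn(batch_size, nz, device=device)

    # Generator step: each generator *maximizes* teacher-student and
    # inter-student discrepancy, i.e. minimizes the negated losses.
    opt_g.zero_grad()
    g_loss = 0.0
    for g in generators:
        x = g(z, y)
        with torch.no_grad():
            t_logits = teacher(x)
        s_logits = [s(x) for s in students]
        g_loss = g_loss - (sum(kd_loss(sl, t_logits) for sl in s_logits)
                           + peer_disagreement(s_logits))
    g_loss.backward()
    opt_g.step()

    # Student step: students minimize distillation loss to the teacher
    # plus a mutual-learning term among peers on frozen surrogate images.
    opt_s.zero_grad()
    s_loss = 0.0
    for g in generators:
        with torch.no_grad():
            x = g(z, y)
            t_logits = teacher(x)
        s_logits = [s(x) for s in students]
        s_loss = s_loss + (sum(kd_loss(sl, t_logits) for sl in s_logits)
                           + peer_disagreement(s_logits))
    s_loss.backward()
    opt_s.step()
    return g_loss.item(), s_loss.item()
```

In practice, opt_g would be an optimizer over the parameters of all generators and opt_s over those of all students, with the teacher kept frozen throughout; the paper's actual objectives, schedules, and class-balancing mechanism may differ from this sketch.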
Pages: 13