Dual discriminator adversarial distillation for data-free model compression

Cited by: 12
Authors
Zhao, Haoran [1]
Sun, Xin [1,2]
Dong, Junyu [1]
Manic, Milos [3]
Zhou, Huiyu [4]
Yu, Hui [5]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
[2] Tech Univ Munich, Dept Aerosp & Geodesy, Munich, Germany
[3] Virginia Commonwealth Univ, Coll Engn, Richmond, VA USA
[4] Univ Leicester, Sch Informat, Leicester, Leics, England
[5] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England
Funding
National Natural Science Foundation of China;
Keywords
Deep neural networks; Image classification; Model compression; Knowledge distillation; Data-free; KNOWLEDGE; NETWORK; RECOGNITION;
DOI
10.1007/s13042-021-01443-0
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation has been widely used to produce portable, efficient neural networks that can be readily deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods need access to the original training data, which is usually enormous and often unavailable. To tackle this problem, we propose a novel data-free approach in this paper, named Dual Discriminator Adversarial Distillation (DDAD), which distills a neural network without the need for any training data or meta-data. Specifically, we use a generator to create samples, through dual discriminator adversarial distillation, that mimic the original training data. The generator not only exploits the pre-trained teacher's intrinsic statistics stored in its batch normalization layers but also seeks samples that maximize the discrepancy between the teacher and the student model. The generated samples are then used to train the compact student network under the supervision of the teacher. The proposed method yields an efficient student network that closely approximates its teacher network, without using the original training data. Extensive experiments on the CIFAR, Caltech101 and ImageNet datasets demonstrate the effectiveness of the proposed approach for classification tasks. Moreover, we extend our method to semantic segmentation tasks on several public datasets, including CamVid, NYUv2, Cityscapes and VOC 2012. To the best of our knowledge, this is the first work on generative-model-based data-free knowledge distillation for large-scale datasets such as ImageNet, Cityscapes and VOC 2012. Experiments show that our method outperforms all baselines for data-free knowledge distillation.
Pages: 1213-1230
Page count: 18
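
The abstract describes the training loop concretely enough to sketch: the generator is pushed to produce images whose feature statistics match the statistics stored in the teacher's batch normalization layers, while also maximizing teacher-student disagreement, and the student is then distilled on those images. The following PyTorch sketch is a minimal illustration of that loop, not the paper's method: the networks (SmallCNN, Generator), loss weights, optimizers and step counts are all hypothetical stand-ins for the actual architectures and hyperparameters.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Toy generator: noise vector -> 32x32 RGB image (illustrative only)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128 * 8 * 8),
            nn.Unflatten(1, (128, 8, 8)),
            nn.BatchNorm2d(128), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, z):
        return self.net(z)

class SmallCNN(nn.Module):
    """Toy classifier with BatchNorm layers (stand-in for teacher/student)."""
    def __init__(self, width, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(),
            nn.Conv2d(width, width, 3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(width, num_classes)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

teacher = SmallCNN(width=64).eval()   # in practice, load pre-trained weights here
student = SmallCNN(width=16)          # compact student to be distilled
generator = Generator()
for p in teacher.parameters():
    p.requires_grad_(False)           # the teacher stays frozen throughout

# Cache the input to every teacher BN layer via forward hooks, so the
# generator can be trained to match the teacher's stored running statistics.
bn_inputs = {}
for m in teacher.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.register_forward_hook(lambda mod, inp, out: bn_inputs.__setitem__(mod, inp[0]))

def bn_statistics_loss():
    """Penalize mismatch between the batch statistics of the generated
    images' features and the teacher's running BN statistics."""
    loss = 0.0
    for layer, feat in bn_inputs.items():
        loss = loss + F.mse_loss(feat.mean(dim=[0, 2, 3]), layer.running_mean) \
                    + F.mse_loss(feat.var(dim=[0, 2, 3], unbiased=False), layer.running_var)
    return loss

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):               # illustrative step count
    z = torch.randn(64, 100)

    # (1) Generator step: match teacher BN statistics while *maximizing*
    # teacher-student discrepancy -- the adversarial signal in the abstract.
    x = generator(z)
    discrepancy = F.l1_loss(student(x), teacher(x))  # teacher forward also fills bn_inputs
    loss_g = bn_statistics_loss() - discrepancy
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # (2) Student step: ordinary distillation (KL to the teacher) on the
    # freshly generated samples, with the generator detached.
    x = generator(z).detach()
    with torch.no_grad():
        t_logits = teacher(x)
    loss_s = F.kl_div(F.log_softmax(student(x), dim=1),
                      F.softmax(t_logits, dim=1), reduction='batchmean')
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

Alternating these two steps drives the generator toward hard, realistic-looking samples (the regions where teacher and student still disagree) and drives the student to close that gap, which is the adversarial game the abstract describes.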
Related papers
50 records in total (entries [31]-[40] shown below)
  • [31] Hao, Zhiwei; Luo, Yong; Wang, Zhi; Hu, Han; An, Jianping. CDFKD-MFS: Collaborative Data-Free Knowledge Distillation via Multi-Level Feature Sharing. IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24: 4262-4274.
  • [32] Li, Jingru; Zhou, Sheng; Li, Liangcheng; Wang, Haishuai; Bu, Jiajun; Yu, Zhi. AdaDFKD: Exploring adaptive inter-sample relationship in data-free knowledge distillation. NEURAL NETWORKS, 2024, 177.
  • [33] Lu, Shaohao; Xian, Yuqiao; Yan, Ke; Hu, Yi; Sun, Xing; Guo, Xiaowei; Huang, Feiyue; Zheng, Wei-Shi. Discriminator-Free Generative Adversarial Attack. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 1544-1552.
  • [34] Ye, Fanfan; Lu, Bingyi; Ma, Liang; Zhong, Qiaoyong; Xie, Di. Up to Thousands-fold Storage Saving: Towards Efficient Data-Free Distillation of Large-Scale Visual Classifiers. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 8376-8386.
  • [35] Huang, Chong; Lin, Shaohui; Zhang, Yan; Li, Ke; Zhang, Baochang. Data-Free Low-Bit Quantization via Dynamic Multi-teacher Knowledge Distillation. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII, 2024, 14432: 28-41.
  • [36] Zhang, Zhixuan; Zheng, Xingjian; Qing, Linbo; Liu, Qi; Wang, Pingyu; Liu, Yu; Liao, Jiyang. A Stable and Efficient Data-Free Model Attack With Label-Noise Data Generation. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20: 3131-3145.
  • [37] Luo, Shiya; Chen, Defang; Wang, Can. Customizing Synthetic Data for Data-Free Student Learning. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 1817-1822.
  • [38] Jiang, Hui; Wu, Di; Wei, Xing; Jiang, Wenhao; Qing, Xiongbo. Discriminator-free adversarial domain adaptation with information balance. ELECTRONIC RESEARCH ARCHIVE, 2025, 33 (01): 210-230.
  • [39] Yu, Mengran; Sun, Shiliang. FE-DaST: Fast and effective data-free substitute training for black-box adversarial attacks. COMPUTERS & SECURITY, 2022, 113.
  • [40] Tian, Sukun; Huang, Renkai; Li, Zhenyang; Fiorenza, Luca; Dai, Ning; Sun, Yuchun; Ma, Haifeng. A Dual Discriminator Adversarial Learning Approach for Dental Occlusal Surface Reconstruction. JOURNAL OF HEALTHCARE ENGINEERING, 2022, 2022.