Dual discriminator adversarial distillation for data-free model compression

Cited by: 12
Authors
Zhao, Haoran [1 ]
Sun, Xin [1 ,2 ]
Dong, Junyu [1 ]
Manic, Milos [3 ]
Zhou, Huiyu [4 ]
Yu, Hui [5 ]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
[2] Tech Univ Munich, Dept Aerosp & Geodesy, Munich, Germany
[3] Virginia Commonwealth Univ, Coll Engn, Richmond, VA USA
[4] Univ Leicester, Sch Informat, Leicester, Leics, England
[5] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England
Funding
National Natural Science Foundation of China;
Keywords
Deep neural networks; Image classification; Model compression; Knowledge distillation; Data-free; KNOWLEDGE; NETWORK; RECOGNITION;
DOI
10.1007/s13042-021-01443-0
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation has been widely used to produce portable and efficient neural networks that can be readily deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods require access to the original training data, which is usually enormous in size and often unavailable. To tackle this problem, we propose a novel data-free approach named Dual Discriminator Adversarial Distillation (DDAD), which distills a neural network without any training data or meta-data. Specifically, a generator is trained via dual discriminator adversarial distillation to create samples that mimic the original training data: it not only exploits the intrinsic statistics stored in the pre-trained teacher's batch normalization layers but also seeks samples that maximize the discrepancy between the teacher and the student. The generated samples are then used to train the compact student network under the supervision of the teacher. The proposed method yields an efficient student network that closely approximates its teacher without using the original training data. Extensive experiments on the CIFAR, Caltech101 and ImageNet datasets demonstrate the effectiveness of the proposed approach for classification tasks. Moreover, we extend our method to semantic segmentation on several public datasets, including CamVid, NYUv2, Cityscapes and VOC 2012. To the best of our knowledge, this is the first work on generative-model-based data-free knowledge distillation for large-scale datasets such as ImageNet, Cityscapes and VOC 2012. Experiments show that our method outperforms all baselines for data-free knowledge distillation.
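The abstract outlines a generator-based training loop: the generator matches the teacher's stored batch-normalization statistics while maximizing the teacher-student output discrepancy, and the student is then distilled on the generated samples. The following PyTorch sketch illustrates that loop under stated assumptions; the function names, loss weights, and the single L1 discrepancy term (which collapses the paper's dual discriminator losses into one adversarial term) are illustrative placeholders, not the authors' released DDAD implementation.

# A minimal PyTorch sketch of the loop described above. All names, loss
# weights and the L1 output-discrepancy term are illustrative assumptions,
# not the authors' released DDAD code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def bn_stat_loss(teacher, x):
    # Match the batch statistics of generated images x to the running
    # mean/var stored in the teacher's BatchNorm layers.
    losses, hooks = [], []
    def hook(module, inputs, output):
        feat = inputs[0]
        mean = feat.mean(dim=[0, 2, 3])
        var = feat.var(dim=[0, 2, 3], unbiased=False)
        losses.append(F.mse_loss(mean, module.running_mean)
                      + F.mse_loss(var, module.running_var))
    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))
    teacher(x)
    for h in hooks:
        h.remove()
    return torch.stack(losses).sum()

def train_step(generator, teacher, student, opt_g, opt_s, z_dim=100, batch=64):
    teacher.eval()
    # Generator step: mimic the teacher's BN statistics while pushing the
    # teacher and student outputs apart (adversarial discrepancy term).
    z = torch.randn(batch, z_dim)
    fake = generator(z)
    disc = F.l1_loss(F.softmax(teacher(fake), dim=1),
                     F.softmax(student(fake), dim=1))
    loss_g = bn_stat_loss(teacher, fake) - disc
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Student step: imitate the teacher on freshly generated samples.
    with torch.no_grad():
        fake = generator(torch.randn(batch, z_dim))
        t_prob = F.softmax(teacher(fake), dim=1)
    loss_s = F.kl_div(F.log_softmax(student(fake), dim=1),
                      t_prob, reduction='batchmean')
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return loss_g.item(), loss_s.item()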
Pages: 1213-1230
Number of pages: 18
Related Papers
50 records in total
  • [1] Dual discriminator adversarial distillation for data-free model compression
    Haoran Zhao
    Xin Sun
    Junyu Dong
    Milos Manic
    Huiyu Zhou
    Hui Yu
    International Journal of Machine Learning and Cybernetics, 2022, 13 : 1213 - 1230
  • [2] Dual-discriminator adversarial framework for data-free quantization
    Li, Zhikai
    Ma, Liping
    Long, Xianlei
    Xiao, Junrui
    Gu, Qingyi
    NEUROCOMPUTING, 2022, 511 : 67 - 77
  • [3] Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression
    Hao, Zhiwei
    Luo, Yong
    Hu, Han
    An, Jianping
    Wen, Yonggang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1803 - 1811
  • [4] Data-Free Network Quantization With Adversarial Knowledge Distillation
    Choi, Yoojin
    Choi, Jihwan
    El-Khamy, Mostafa
    Lee, Jungwon
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3047 - 3057
  • [5] ENHANCING DATA-FREE ADVERSARIAL DISTILLATION WITH ACTIVATION REGULARIZATION AND VIRTUAL INTERPOLATION
    Qu, Xiaoyang
    Wang, Jianzong
    Xiao, Jing
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3340 - 3344
  • [6] DHBE: Data-free Holistic Backdoor Erasing in Deep Neural Networks via Restricted Adversarial Distillation
    Yan, Zhicong
    Li, Shenghong
    Zhao, Ruijie
    Tian, Yuan
    Zhao, Yuanyuan
    PROCEEDINGS OF THE 2023 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, ASIA CCS 2023, 2023, : 731 - 745
  • [7] Conditional generative data-free knowledge distillation
    Yu, Xinyi
    Yan, Ling
    Yang, Yang
    Zhou, Libo
    Ou, Linlin
    IMAGE AND VISION COMPUTING, 2023, 131
  • [8] Reusable generator data-free knowledge distillation with hard loss simulation for image classification
    Sun, Yafeng
    Wang, Xingwang
    Huang, Junhong
    Chen, Shilin
    Hou, Minghui
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 265
  • [9] Conditional pseudo-supervised contrast for data-Free knowledge distillation
    Shao, Renrong
    Zhang, Wei
    Wang, Jun
    PATTERN RECOGNITION, 2023, 143
  • [10] D3K: Dynastic Data-Free Knowledge Distillation
    Li, Xiufang
    Sun, Qigong
    Jiao, Licheng
    Liu, Fang
    Liu, Xu
    Li, Lingling
    Chen, Puhua
    Zuo, Yi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8358 - 8371