Dual discriminator adversarial distillation for data-free model compression

Cited by: 12
Authors
Zhao, Haoran [1 ]
Sun, Xin [1 ,2 ]
Dong, Junyu [1 ]
Manic, Milos [3 ]
Zhou, Huiyu [4 ]
Yu, Hui [5 ]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao, Peoples R China
[2] Tech Univ Munich, Dept Aerosp & Geodesy, Munich, Germany
[3] Virginia Commonwealth Univ, Coll Engn, Richmond, VA USA
[4] Univ Leicester, Sch Informat, Leicester, Leics, England
[5] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England
Funding
National Natural Science Foundation of China;
Keywords
Deep neural networks; Image classification; Model compression; Knowledge distillation; Data-free; KNOWLEDGE; NETWORK; RECOGNITION;
DOI
10.1007/s13042-021-01443-0
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation has been widely used to produce portable and efficient neural networks that can be readily deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods require access to the original training data, which is usually enormous in size and often unavailable. To tackle this problem, we propose a novel data-free approach named Dual Discriminator Adversarial Distillation (DDAD), which distills a neural network without any training data or meta-data. Specifically, a generator is trained via dual discriminator adversarial distillation to create samples that mimic the original training data: it not only exploits the intrinsic statistics stored in the pre-trained teacher's batch normalization layers but also seeks samples that maximize the discrepancy between the teacher and the student. The generated samples are then used to train the compact student network under the supervision of the teacher. The proposed method yields an efficient student network that closely approximates its teacher without using the original training data. Extensive experiments on the CIFAR, Caltech101 and ImageNet datasets demonstrate the effectiveness of the proposed approach for classification tasks. Moreover, we extend our method to semantic segmentation on several public datasets, including CamVid, NYUv2, Cityscapes and VOC 2012. To the best of our knowledge, this is the first work on generative-model-based data-free knowledge distillation for large-scale datasets such as ImageNet, Cityscapes and VOC 2012. Experiments show that our method outperforms all baselines for data-free knowledge distillation.
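The abstract outlines a generator-based training loop: the generator matches the teacher's stored batch-normalization statistics while maximizing the teacher-student output discrepancy, and the student is then distilled on the generated samples. The following PyTorch sketch illustrates that loop under stated assumptions; the function names, loss weights, and the single L1 discrepancy term (which collapses the paper's dual discriminator losses into one adversarial term) are illustrative placeholders, not the authors' released DDAD implementation.

# A minimal PyTorch sketch of the loop described above. All names, loss
# weights and the L1 output-discrepancy term are illustrative assumptions,
# not the authors' released DDAD code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def bn_stat_loss(teacher, x):
    # Match the batch statistics of generated images x to the running
    # mean/var stored in the teacher's BatchNorm layers.
    losses, hooks = [], []
    def hook(module, inputs, output):
        feat = inputs[0]
        mean = feat.mean(dim=[0, 2, 3])
        var = feat.var(dim=[0, 2, 3], unbiased=False)
        losses.append(F.mse_loss(mean, module.running_mean)
                      + F.mse_loss(var, module.running_var))
    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))
    teacher(x)
    for h in hooks:
        h.remove()
    return torch.stack(losses).sum()

def train_step(generator, teacher, student, opt_g, opt_s, z_dim=100, batch=64):
    teacher.eval()
    # Generator step: mimic the teacher's BN statistics while pushing the
    # teacher and student outputs apart (adversarial discrepancy term).
    z = torch.randn(batch, z_dim)
    fake = generator(z)
    disc = F.l1_loss(F.softmax(teacher(fake), dim=1),
                     F.softmax(student(fake), dim=1))
    loss_g = bn_stat_loss(teacher, fake) - disc
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Student step: imitate the teacher on freshly generated samples.
    with torch.no_grad():
        fake = generator(torch.randn(batch, z_dim))
        t_prob = F.softmax(teacher(fake), dim=1)
    loss_s = F.kl_div(F.log_softmax(student(fake), dim=1),
                      t_prob, reduction='batchmean')
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return loss_g.item(), loss_s.item()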
Pages: 1213-1230
Number of pages: 18
Related Papers
50 records in total
  • [1] Dual discriminator adversarial distillation for data-free model compression
    Haoran Zhao
    Xin Sun
    Junyu Dong
    Milos Manic
    Huiyu Zhou
    Hui Yu
    International Journal of Machine Learning and Cybernetics, 2022, 13 : 1213 - 1230
  • [2] Dual-discriminator adversarial framework for data-free quantization
    Li, Zhikai
    Ma, Liping
    Long, Xianlei
    Xiao, Junrui
    Gu, Qingyi
    NEUROCOMPUTING, 2022, 511 : 67 - 77
  • [3] Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression
    Hao, Zhiwei
    Luo, Yong
    Hu, Han
    An, Jianping
    Wen, Yonggang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1803 - 1811
  • [4] Data-Free Network Quantization With Adversarial Knowledge Distillation
    Choi, Yoojin
    Choi, Jihwan
    El-Khamy, Mostafa
    Lee, Jungwon
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3047 - 3057
  • [5] ENHANCING DATA-FREE ADVERSARIAL DISTILLATION WITH ACTIVATION REGULARIZATION AND VIRTUAL INTERPOLATION
    Qu, Xiaoyang
    Wang, Jianzong
    Xiao, Jing
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3340 - 3344
  • [6] DHBE: Data-free Holistic Backdoor Erasing in Deep Neural Networks via Restricted Adversarial Distillation
    Yan, Zhicong
    Li, Shenghong
    Zhao, Ruijie
    Tian, Yuan
    Zhao, Yuanyuan
    PROCEEDINGS OF THE 2023 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, ASIA CCS 2023, 2023, : 731 - 745
  • [7] Conditional generative data-free knowledge distillation
    Yu, Xinyi
    Yan, Ling
    Yang, Yang
    Zhou, Libo
    Ou, Linlin
    IMAGE AND VISION COMPUTING, 2023, 131
  • [8] Reusable generator data-free knowledge distillation with hard loss simulation for image classification
    Sun, Yafeng
    Wang, Xingwang
    Huang, Junhong
    Chen, Shilin
    Hou, Minghui
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 265
  • [9] Conditional pseudo-supervised contrast for data-Free knowledge distillation
    Shao, Renrong
    Zhang, Wei
    Wang, Jun
    PATTERN RECOGNITION, 2023, 143
  • [10] D3K: Dynastic Data-Free Knowledge Distillation
    Li, Xiufang
    Sun, Qigong
    Jiao, Licheng
    Liu, Fang
    Liu, Xu
    Li, Lingling
    Chen, Puhua
    Zuo, Yi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8358 - 8371