PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation

Cited by: 9
Authors
Kim, Jangho [1 ,2 ]
Chang, Simyung [1 ]
Kwak, Nojun [2 ]
Affiliations
[1] Qualcomm Korea YH, Qualcomm AI Res, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
Source
INTERSPEECH 2021 | 2021
Funding
National Research Foundation of Singapore;
Keywords
keyword spotting; model pruning; model quantization; knowledge distillation;
DOI
10.21437/Interspeech.2021-248
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104; 100213;
Abstract
As edge devices become prevalent, deploying deep neural networks (DNNs) on them has become a critical issue. However, DNNs require high computational resources that are rarely available on edge devices. To address this, we propose PQK, a novel model compression method for devices with limited computational resources, consisting of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of the unimportant weights removed in the pruning process to build a teacher network that trains a better student network, without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to produce a lightweight and power-efficient model. In phase 2, we build a teacher network by adding the unimportant weights unused in phase 1 back to the pruned network, and we train the pruned network as a student of this teacher. Consequently, no pre-trained teacher network is needed for the KD framework, because the teacher and the student coexist within the same network (see Fig. 1). We apply our method to recognition models and verify the effectiveness of PQK on keyword spotting (KWS) and image recognition.
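The abstract describes a two-phase procedure: phase 1 performs iterative pruning and quantization-aware training, and phase 2 builds an in-network teacher from the pruned-away weights and distills it into the pruned student. As a rough illustration of the phase-2 idea only, the PyTorch sketch below shares a single weight tensor between teacher and student: the student uses only the weights kept by a pruning mask, while the teacher additionally uses the "unimportant" weights, so no separately pre-trained teacher is required. This is a minimal sketch under stated assumptions, not the authors' implementation: the MaskedLinear and TinyNet classes, the magnitude-based mask, layer sizes, temperature T, loss weighting alpha, and the joint teacher/student loss are all illustrative, and the quantization-aware training of phase 1 is omitted; the paper's exact phase-2 objective and schedule may differ.

# Minimal sketch of the PQK phase-2 idea (not the authors' code). The teacher
# and student coexist in one network: they share the important weights kept by
# the pruning mask, and the teacher additionally uses the pruned weights.
# All class names, sizes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features, keep_ratio=0.5):
        super().__init__()
        self.weight = nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Stand-in for the phase-1 result: a magnitude-based mask where
        # 1 marks an important (student) weight and 0 an unimportant one.
        n = self.weight.numel()
        k = int(keep_ratio * n)
        threshold = self.weight.abs().flatten().kthvalue(n - k).values
        self.register_buffer("mask", (self.weight.abs() > threshold).float())

    def forward(self, x, as_teacher=False):
        # Student path: masked weights only. Teacher path: full weight tensor.
        w = self.weight if as_teacher else self.weight * self.mask
        return F.linear(x, w, self.bias)


class TinyNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.fc1 = MaskedLinear(64, 128)
        self.fc2 = MaskedLinear(128, num_classes)

    def forward(self, x, as_teacher=False):
        h = F.relu(self.fc1(x, as_teacher))
        return self.fc2(h, as_teacher)


def pqk_phase2_loss(model, x, y, T=4.0, alpha=0.5):
    # One plausible reading of phase 2: train the teacher with cross-entropy
    # and the student with cross-entropy plus distillation from the (detached)
    # teacher logits. The paper's exact loss and schedule may differ.
    teacher_logits = model(x, as_teacher=True)
    student_logits = model(x, as_teacher=False)
    ce_teacher = F.cross_entropy(teacher_logits, y)
    ce_student = F.cross_entropy(student_logits, y)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    return ce_teacher + (1 - alpha) * ce_student + alpha * kd


# Usage: one phase-2 training step on random data.
model = TinyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
loss = pqk_phase2_loss(model, x, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()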
Pages: 4568-4572
Number of pages: 5
Related papers (50 in total)
  • [1] Compression of Acoustic Model via Knowledge Distillation and Pruning
    Li, Chenxing
    Zhu, Lei
    Xu, Shuang
    Gao, Peng
    Xu, Bo
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2785 - 2790
  • [2] Model compression via pruning and knowledge distillation for person re-identification
    Xie, Haonan
    Jiang, Wei
    Luo, Hao
    Yu, Hongyan
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 12 (02) : 2149 - 2161
  • [3] Quantization Robust Pruning With Knowledge Distillation
    Kim, Jangho
    IEEE ACCESS, 2023, 11 : 26419 - 26426
  • [4] Model compression via pruning and knowledge distillation for person re-identification
    Haonan Xie
    Wei Jiang
    Hao Luo
    Hongyan Yu
    Journal of Ambient Intelligence and Humanized Computing, 2021, 12 : 2149 - 2161
  • [5] Efficient and Controllable Model Compression through Sequential Knowledge Distillation and Pruning
    Malihi, Leila
    Heidemann, Gunther
    BIG DATA AND COGNITIVE COMPUTING, 2023, 7 (03)
  • [6] End-to-end model compression via pruning and knowledge distillation for lightweight image super resolution
    Yanzhe Wang
    Yizhen Wang
    Avinash Rohra
    Baoqun Yin
    Pattern Analysis and Applications, 2025, 28 (2)
  • [7] Joint structured pruning and dense knowledge distillation for efficient transformer model compression
    Cui, Baiyun
    Li, Yingming
    Zhang, Zhongfei
    NEUROCOMPUTING, 2021, 458 : 56 - 69
  • [8] Model Compression by Iterative Pruning with Knowledge Distillation and Its Application to Speech Enhancement
    Wei, Zeyuan
    Li, Hao
    Zhang, Xueliang
    INTERSPEECH 2022, 2022, : 941 - 945
  • [9] The Optimization Method of Knowledge Distillation Based on Model Pruning
    Wu, Min
    Ma, Weihua
    Li, Yue
    Zhao, Xiongbo
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 1386 - 1390
  • [10] Matching the Ideal Pruning Method with Knowledge Distillation for Optimal Compression
    Malihi, Leila
    Heidemann, Gunther
    APPLIED SYSTEM INNOVATION, 2024, 7 (04)