Model Selection - Knowledge Distillation Framework for Model Compression

Times Cited: 0
Authors
Chen, Renhai [1 ]
Yuan, Shimin [1 ]
Wang, Shaobo [1 ]
Li, Zhenghan [1 ]
Xing, Meng [1 ]
Feng, Zhiyong [1 ]
Affiliations
[1] Tianjin Univ, Shenzhen Res Inst, Coll Intelligence & Comp, Tianjin, Peoples R China
Source
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021
Funding
National Natural Science Foundation of China;
Keywords
model selection; model compression; knowledge distillation;
DOI
10.1109/SSCI50451.2021.9659861
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The significant increase in the computation and parameter storage costs of CNNs accompanies their development in various applications and restricts their deployment on edge devices. Therefore, many neural network pruning methods have been proposed for network compression and acceleration. However, these methods have two major limitations. First, prevailing methods usually design a single pruning criterion for the primitive network and fail to consider the diversity of potential optimal sub-network structures. Second, they use traditional training methods to train the sub-network, which is not enough to develop the expressive ability of the sub-network for the current task. In this paper, we propose the Model Selection - Knowledge Distillation (MS-KD) framework to solve these problems. Specifically, we develop multiple pruning criteria for the primitive network, and the potential optimal structure is obtained through model selection. Furthermore, instead of traditional training methods, we use knowledge distillation to train the learned sub-network and make full use of its structural advantages. To validate our approach, we conduct extensive experiments on prevalent image classification datasets. The results demonstrate that our MS-KD framework outperforms existing methods across a wide range of datasets, models, and inference costs.
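The abstract outlines a three-step pipeline: generate candidate sub-networks with several pruning criteria, select the most promising one (model selection), and then train it with knowledge distillation from the unpruned teacher. The PyTorch sketch below is a minimal illustration of that pipeline; the toy CNN, the L1/L2 filter criteria, the mask-based pruning of a single layer, and the temperature and alpha values are illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of the MS-KD idea: multi-criteria pruning, model selection,
# then knowledge-distillation training of the selected sub-network.
# All architectures, criteria, and hyper-parameters here are assumptions for
# illustration only.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def l1_filter_scores(conv: nn.Conv2d) -> torch.Tensor:
    """Score each output filter by the L1 norm of its weights."""
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def l2_filter_scores(conv: nn.Conv2d) -> torch.Tensor:
    """Score each output filter by the L2 norm of its weights."""
    return conv.weight.detach().pow(2).sum(dim=(1, 2, 3)).sqrt()


def prune_first_conv(model: nn.Module, scores_fn, keep_ratio: float = 0.5) -> nn.Module:
    """Zero out the lowest-scoring filters of the first conv layer (mask-based pruning)."""
    pruned = copy.deepcopy(model)
    conv = next(m for m in pruned.modules() if isinstance(m, nn.Conv2d))
    scores = scores_fn(conv)
    n_keep = max(1, int(keep_ratio * scores.numel()))
    keep = scores.topk(n_keep).indices
    mask = torch.zeros_like(scores)
    mask[keep] = 1.0
    with torch.no_grad():
        conv.weight.mul_(mask.view(-1, 1, 1, 1))
    return pruned


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Standard KD loss: temperature-softened KL term plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard


def make_teacher() -> nn.Module:
    """Tiny CNN standing in for the primitive (unpruned) network."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
    )


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy random tensors in place of an image-classification dataset.
    x, y = torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))

    teacher = make_teacher()

    # Step 1: build candidate sub-networks with multiple pruning criteria.
    candidates = {
        "l1": prune_first_conv(teacher, l1_filter_scores),
        "l2": prune_first_conv(teacher, l2_filter_scores),
    }

    # Step 2: model selection -- keep the candidate with the lowest held-out loss.
    with torch.no_grad():
        losses = {name: F.cross_entropy(m(x), y).item() for name, m in candidates.items()}
    best_name = min(losses, key=losses.get)
    student = candidates[best_name]

    # Step 3: train the selected sub-network with knowledge distillation.
    opt = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
    for _ in range(5):
        opt.zero_grad()
        with torch.no_grad():
            t_logits = teacher(x)
        loss = distillation_loss(student(x), t_logits, y)
        loss.backward()
        opt.step()
    print(f"selected criterion: {best_name}, final KD loss: {loss.item():.4f}")
```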
Pages: 6
Related Papers
50 records in total
  • [21] Xie, Haonan; Jiang, Wei; Luo, Hao; Yu, Hongyan. Model compression via pruning and knowledge distillation for person re-identification. Journal of Ambient Intelligence and Humanized Computing, 2021, 12: 2149-2161.
  • [22] Niyaz, Usma; Bathula, Deepti R. Augmenting Knowledge Distillation with Peer-to-Peer Mutual Learning for Model Compression. 2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022.
  • [23] Wang, Yanzhe; Wang, Yizhen; Rohra, Avinash; Yin, Baoqun. End-to-end model compression via pruning and knowledge distillation for lightweight image super resolution. Pattern Analysis and Applications, 2025, 28 (2).
  • [24] Xu, Pengcheng; Kim, Kyungsang; Liu, Huafeng; Li, Quanzheng. Attention-Fused CNN Model Compression with Knowledge Distillation for Brain Tumor Segmentation. MEDICAL IMAGE UNDERSTANDING AND ANALYSIS, MIUA 2022, 2022, 13413: 328-338.
  • [25] Wu, Min; Ma, Weihua; Li, Yue; Zhao, Xiongbo. The Optimization Method of Knowledge Distillation Based on Model Pruning. 2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020: 1386-1390.
  • [26] Yang, Ze; Shou, Linjun; Gong, Ming; Lin, Wutao; Jiang, Daxin. Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System. PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM '20), 2020: 690-698.
  • [27] Chen, Qipeng; Xiong, Qiaoqiao; Huang, Haisong; Tang, Saihong; Liu, Zhenghong. Research on the Construction of an Efficient and Lightweight Online Detection Method for Tiny Surface Defects through Model Compression and Knowledge Distillation. ELECTRONICS, 2024, 13 (02).
  • [28] Li, Yixin; Hu, Fu; Liu, Ying; Ryan, Michael; Wang, Ray. A hybrid model compression approach via knowledge distillation for predicting energy consumption in additive manufacturing. INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2023, 61 (13): 4525-4547.
  • [29] Liu, Sicong; Lin, Yingyan; Zhou, Zimu; Nan, Kaiming; Liu, Hui; Du, Junzhao. On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework. MOBISYS'18: PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON MOBILE SYSTEMS, APPLICATIONS, AND SERVICES, 2018: 389-400.
  • [30] Xu, Qing; Chen, Zhenghua; Ragab, Mohamed; Wang, Chao; Wu, Min; Li, Xiaoli. Contrastive adversarial knowledge distillation for deep model compression in time-series regression tasks. NEUROCOMPUTING, 2022, 485: 242-251.