Model Selection - Knowledge Distillation Framework for Model Compression

Cited by: 0
Authors
Chen, Renhai [1 ]
Yuan, Shimin [1 ]
Wang, Shaobo [1 ]
Li, Zhenghan [1 ]
Xing, Meng [1 ]
Feng, Zhiyong [1 ]
Affiliations
[1] Tianjin Univ, Shenzhen Res Inst, Coll Intelligence & Comp, Tianjin, Peoples R China
Source
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021
Funding
National Natural Science Foundation of China;
Keywords
model selection; model compression; knowledge distillation;
DOI
10.1109/SSCI50451.2021.9659861
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The significant increase in the computation and parameter storage costs of CNNs, which accompanies their growing use in various applications, restricts their deployment on edge devices. Therefore, many neural network pruning methods have been proposed for network compression and acceleration. However, these methods have two major limitations: first, prevailing methods usually design a single pruning criterion for the primitive network and fail to consider the diversity of potentially optimal sub-network structures; second, they train the sub-network with traditional training methods, which is not enough to develop the expressive ability of the sub-network on the current task. In this paper, we propose the Model Selection - Knowledge Distillation (MS-KD) framework to solve the above problems. Specifically, we develop multiple pruning criteria for the primitive network, and the potentially optimal structure is obtained through model selection. Furthermore, instead of traditional training methods, we use knowledge distillation to train the learned sub-network and make full use of its structural advantages. To validate our approach, we conduct extensive experiments on prevalent image classification datasets. The results demonstrate that our MS-KD framework outperforms existing methods across a wide range of datasets, models, and inference costs.
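The abstract describes a two-step pipeline: prune the primitive network under several criteria, select the best-scoring sub-network, then train it with knowledge distillation from the original (teacher) network. Below is a minimal PyTorch-style sketch of such a pipeline, not the authors' implementation; the helper names (kd_loss, evaluate, ms_kd), the assumption that candidate sub-networks are supplied already pruned, and all hyperparameters (temperature T, weight alpha, optimizer settings) are illustrative.

# Sketch only: illustrates the selection + distillation steps named in the
# abstract; pruning criteria and candidate generation are outside this sketch.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Distillation loss: softened KL term plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

@torch.no_grad()
def evaluate(model, loader, device):
    """Top-1 accuracy on a held-out loader (used for model selection)."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / max(total, 1)

def ms_kd(teacher, candidates, train_loader, val_loader, device="cpu"):
    """candidates: sub-networks pruned from the primitive network with
    different criteria (hypothetical input, produced elsewhere)."""
    teacher.to(device).eval()
    for m in candidates:
        m.to(device)

    # 1) Model selection: keep the candidate that scores best on held-out data.
    student = max(candidates, key=lambda m: evaluate(m, val_loader, device))

    # 2) Knowledge distillation: train the selected sub-network with the
    #    teacher's soft targets instead of plain cross-entropy training.
    optimizer = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
    student.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        with torch.no_grad():
            t_logits = teacher(images)
        loss = kd_loss(student(images), t_logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return student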
Pages: 6
Related Papers
50 items in total
  • [31] LAD: Layer-Wise Adaptive Distillation for BERT Model Compression
    Lin, Ying-Jia
    Chen, Kuan-Yu
    Kao, Hung-Yu
    SENSORS, 2023, 23 (03)
  • [32] Dual discriminator adversarial distillation for data-free model compression
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Manic, Milos
    Zhou, Huiyu
    Yu, Hui
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (05) : 1213 - 1230
  • [34] Knowledge distillation for object detection with diffusion model
    Zhang, Yi
    Long, Junzong
    Li, Chunrui
    NEUROCOMPUTING, 2025, 636
  • [35] AdaDS: Adaptive data selection for accelerating pre-trained language model knowledge distillation
    Zhou, Qinhong
    Li, Peng
    Liu, Yang
    Guan, Yuyang
    Xing, Qizhou
    Chen, Ming
    Sun, Maosong
    Liu, Yang
    AI OPEN, 2023, 4 : 56 - 63
  • [36] A Task-Efficient Gradient Guide Knowledge Distillation for Pre-train Language Model Compression
    Liu, Xu
    Su, Yila
    Wu, Nier
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 366 - 377
  • [37] Semantic Segmentation Optimization Algorithm Based on Knowledge Distillation and Model Pruning
    Yao, Weiwei
    Zhang, Jie
    Li, Chen
    Li, Shiyun
    He, Li
    Zhang, Bo
    2019 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2019), 2019, : 261 - 265
  • [38] Effective Compression of Language Models by Combining Pruning and Knowledge Distillation
    Chiu, Chi-Yu
    Hong, Ding-Yong
    Liu, Pangfeng
    Wu, Jan-Jan
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024, : 429 - 438
  • [39] A Unified Asymmetric Knowledge Distillation Framework for Image Classification
    Ye, Xin
    Tian, Xiang
    Zheng, Bolun
    Zhou, Fan
    Chen, Yaowu
    NEURAL PROCESSING LETTERS, 2024, 56 (04)
  • [40] Compressing the Multiobject Tracking Model via Knowledge Distillation
    Liang, Tianyi
    Wang, Mengzhu
    Chen, Junyang
    Chen, Dingyao
    Luo, Zhigang
    Leung, Victor C. M.
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (02) : 2713 - 2723