Self-knowledge distillation enhanced binary neural networks derived from underutilized information

Times Cited: 1
Authors
Zeng, Kai [1 ,2 ]
Wan, Zixin [1 ,2 ]
Gu, Hongwei [1 ,2 ]
Shen, Tao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Sch Informat Engn & Automat, Wujiaying St, Kunming 650031, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Comp Technol Applicat, Wujiaying St, Kunming 650031, Yunnan, Peoples R China
Keywords
Binary neural networks; Self-knowledge distillation; Underutilized information; Binarization;
DOI
10.1007/s10489-024-05444-8
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Binarization efficiently compresses full-precision convolutional neural networks (CNNs) to accelerate inference, but at the cost of substantial performance degradation. Self-knowledge distillation (SKD) can significantly improve the performance of a network by inheriting its own advanced knowledge. However, SKD for binary neural networks (BNNs) remains underexplored because the binary characteristics of weak BNNs limit their ability to act as effective teachers and hinder their learning as students. In this study, a novel SKD-BNN framework is proposed that exploits two pieces of underutilized information. The full-precision weights, which are retained for gradient updates, concurrently distill teacher feature knowledge with high-level semantics. A value-swapping strategy minimizes the knowledge capacity gap, while the channel-spatial difference distillation loss promotes feature transfer. Moreover, historical output predictions generate a concentrated soft-label bank, providing abundant intra- and inter-category similarity knowledge. Dynamic filtering ensures the correctness of the soft labels during training, and the label-cluster loss enhances the summarization ability of the soft-label bank within the same category. The developed methods excel in extensive experiments, achieving a state-of-the-art accuracy of 93.0% on the CIFAR-10 dataset, equivalent to that of full-precision CNNs. On the ImageNet dataset, the accuracy improves by 1.6% with the widely adopted IR-Net. Notably, for the first time, the proposed method fully explores the underutilized information contained in BNNs and conducts an effective SKD process, enabling weak BNNs to serve as competent self-teachers and proficient students.
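The abstract describes two mechanisms in general terms: latent full-precision weights acting as an in-network teacher through a channel-spatial difference loss, and a soft-label bank built from dynamically filtered historical predictions. The PyTorch sketch below is a minimal illustration of how such components could be wired up under our own assumptions; it is not the paper's implementation, and all names (BinaryConv2d, SoftLabelBank, feature_difference_loss, the attention descriptors, the momentum update) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryConv2d(nn.Conv2d):
    """1-bit convolution: the forward pass uses sign(weights), while latent
    full-precision weights are kept for gradient updates via a
    straight-through estimator (the standard BNN recipe)."""

    def forward(self, x):
        w = self.weight
        bin_w = torch.sign(w)
        bin_w = w + (bin_w - w).detach()   # STE: forward sign(w), backward identity
        return F.conv2d(x, bin_w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


def channel_spatial_attention(feat):
    """Channel and spatial descriptors of a feature map of shape (N, C, H, W)."""
    ch = F.normalize(feat.abs().mean(dim=(2, 3)), dim=1)        # (N, C)
    sp = F.normalize(feat.abs().mean(dim=1).flatten(1), dim=1)  # (N, H*W)
    return ch, sp


def feature_difference_loss(binary_feat, full_precision_feat):
    """Channel-spatial difference style loss: pull the binary branch's attention
    statistics toward those obtained with the latent full-precision weights."""
    s_ch, s_sp = channel_spatial_attention(binary_feat)
    t_ch, t_sp = channel_spatial_attention(full_precision_feat.detach())
    return F.mse_loss(s_ch, t_ch) + F.mse_loss(s_sp, t_sp)


class SoftLabelBank:
    """Stores one exponentially averaged historical prediction per class.
    Only predictions that pass a dynamic filter (here: argmax equals the
    ground-truth label) are allowed to update the bank."""

    def __init__(self, num_classes, momentum=0.9):
        self.bank = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, logits, targets):
        probs = F.softmax(logits, dim=1).cpu()
        targets = targets.cpu()
        correct = probs.argmax(dim=1).eq(targets)   # dynamic filtering
        for p, y in zip(probs[correct], targets[correct]):
            self.bank[y] = self.momentum * self.bank[y] + (1 - self.momentum) * p

    def distill_loss(self, logits, targets, temperature=4.0):
        """KL divergence between the student's softened output and the stored
        per-class soft label (the bank already holds probabilities)."""
        soft = self.bank[targets.cpu()].to(logits.device)
        log_p = F.log_softmax(logits / temperature, dim=1)
        return F.kl_div(log_p, soft, reduction="batchmean") * temperature ** 2
```

In a setup like this, the total training objective would typically combine the ordinary cross-entropy term with the feature-difference and soft-label distillation terms under tunable weights; the paper's value-swapping strategy and label-cluster loss are not reproduced in this sketch.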
Pages: 4994-5014
Number of Pages: 21