Self-knowledge distillation enhanced binary neural networks derived from underutilized information

Times Cited: 1
Authors
Zeng, Kai [1 ,2 ]
Wan, Zixin [1 ,2 ]
Gu, Hongwei [1 ,2 ]
Shen, Tao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Sch Informat Engn & Automat, Wujiaying St, Kunming 650031, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Comp Technol Applicat, Wujiaying St, Kunming 650031, Yunnan, Peoples R China
Keywords
Binary neural networks; Self-knowledge distillation; Underutilized information; Binarization;
DOI
10.1007/s10489-024-05444-8
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Binarization efficiently compresses full-precision convolutional neural networks (CNNs) to accelerate inference, but at the cost of substantial performance degradation. Self-knowledge distillation (SKD) can significantly improve the performance of a network by inheriting its own advanced knowledge. However, SKD for binary neural networks (BNNs) remains underexplored because the binary characteristics of weak BNNs limit their ability to act as effective teachers and hinder their learning as students. In this study, a novel SKD-BNN framework is proposed that exploits two pieces of underutilized information. The full-precision weights, which are retained for gradient updates, concurrently distill teacher feature knowledge with high-level semantics. A value-swapping strategy minimizes the knowledge capacity gap, while the channel-spatial difference distillation loss promotes feature transfer. Moreover, historical output predictions generate a concentrated soft-label bank, providing abundant intra- and inter-category similarity knowledge. Dynamic filtering ensures the correctness of the soft labels during training, and the label-cluster loss enhances the summarization ability of the soft-label bank within the same category. The developed methods excel in extensive experiments, achieving a state-of-the-art accuracy of 93.0% on the CIFAR-10 dataset, equivalent to that of full-precision CNNs. On the ImageNet dataset, the accuracy improves by 1.6% with the widely adopted IR-Net. Notably, for the first time, the proposed method fully explores the underutilized information contained in BNNs and conducts an effective SKD process, enabling weak BNNs to serve as competent self-teachers and proficient students.
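The abstract describes two mechanisms in general terms: latent full-precision weights acting as an in-network teacher through a channel-spatial difference loss, and a soft-label bank built from dynamically filtered historical predictions. The PyTorch sketch below is a minimal illustration of how such components could be wired up under our own assumptions; it is not the paper's implementation, and all names (BinaryConv2d, SoftLabelBank, feature_difference_loss, the attention descriptors, the momentum update) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryConv2d(nn.Conv2d):
    """1-bit convolution: the forward pass uses sign(weights), while latent
    full-precision weights are kept for gradient updates via a
    straight-through estimator (the standard BNN recipe)."""

    def forward(self, x):
        w = self.weight
        bin_w = torch.sign(w)
        bin_w = w + (bin_w - w).detach()   # STE: forward sign(w), backward identity
        return F.conv2d(x, bin_w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


def channel_spatial_attention(feat):
    """Channel and spatial descriptors of a feature map of shape (N, C, H, W)."""
    ch = F.normalize(feat.abs().mean(dim=(2, 3)), dim=1)        # (N, C)
    sp = F.normalize(feat.abs().mean(dim=1).flatten(1), dim=1)  # (N, H*W)
    return ch, sp


def feature_difference_loss(binary_feat, full_precision_feat):
    """Channel-spatial difference style loss: pull the binary branch's attention
    statistics toward those obtained with the latent full-precision weights."""
    s_ch, s_sp = channel_spatial_attention(binary_feat)
    t_ch, t_sp = channel_spatial_attention(full_precision_feat.detach())
    return F.mse_loss(s_ch, t_ch) + F.mse_loss(s_sp, t_sp)


class SoftLabelBank:
    """Stores one exponentially averaged historical prediction per class.
    Only predictions that pass a dynamic filter (here: argmax equals the
    ground-truth label) are allowed to update the bank."""

    def __init__(self, num_classes, momentum=0.9):
        self.bank = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, logits, targets):
        probs = F.softmax(logits, dim=1).cpu()
        targets = targets.cpu()
        correct = probs.argmax(dim=1).eq(targets)   # dynamic filtering
        for p, y in zip(probs[correct], targets[correct]):
            self.bank[y] = self.momentum * self.bank[y] + (1 - self.momentum) * p

    def distill_loss(self, logits, targets, temperature=4.0):
        """KL divergence between the student's softened output and the stored
        per-class soft label (the bank already holds probabilities)."""
        soft = self.bank[targets.cpu()].to(logits.device)
        log_p = F.log_softmax(logits / temperature, dim=1)
        return F.kl_div(log_p, soft, reduction="batchmean") * temperature ** 2
```

In a setup like this, the total training objective would typically combine the ordinary cross-entropy term with the feature-difference and soft-label distillation terms under tunable weights; the paper's value-swapping strategy and label-cluster loss are not reproduced in this sketch.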
Pages: 4994-5014
Number of Pages: 21