Logitwise Distillation Network: Improving Knowledge Distillation via Introducing Sample Confidence

Cited: 0
Authors
Shen, Teng [1 ]
Cui, Zhenchao [1 ]
Qi, Jing [1 ]
Affiliations
[1] Hebei Univ, Hebei Prov Machine Vis Engn Res Ctr, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, Issue 05
Keywords
image classification; knowledge distillation; logit information; sample confidence;
DOI
10.3390/app15052285
CLC Classification
O6 [Chemistry];
Subject Classification
0703;
Abstract
Existing knowledge distillation (KD) methods typically force students to mimic teacher features without considering prediction reliability, a practice that risks propagating the teacher's erroneous supervision to the student. To address this, we propose the Logitwise Distillation Network (LDN), a framework that dynamically quantifies sample-wise confidence through the rank of the ground-truth label in the teacher's logits. Specifically, LDN introduces three key innovations: (1) weighted class means that prioritize high-confidence samples, (2) adaptive feature selection based on logit ranking, and (3) positive-negative sample adjustment (PNSA) to reverse error-prone supervision. These components are unified into a feature direction (FD) loss that guides students to selectively emulate trustworthy teacher features. Experiments on CIFAR-100 and ImageNet demonstrate that LDN achieves state-of-the-art performance, improving accuracy by 0.3-0.5% over prior SOTA methods. Notably, LDN exhibits stronger compatibility with homogeneous networks (a 2.4% gain over baselines) and adds no extra training cost when integrated into existing KD pipelines. This work advances feature distillation by addressing error propagation, offering a plug-and-play solution for reliable knowledge transfer.
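The abstract does not include the exact formulation, so the following is a minimal PyTorch sketch of the core idea only: deriving per-sample confidence from the rank of the ground-truth label in the teacher's logits and using it to weight a direction-based feature loss. The function names `sample_confidence` and `weighted_feature_loss`, the linear rank-to-confidence mapping, and the cosine-similarity feature term are illustrative assumptions, not the paper's actual LDN components or FD-loss definition.

```python
# Hedged sketch: rank-based sample confidence and a confidence-weighted
# feature-direction loss. All names and the exact weighting scheme are
# assumptions for illustration, not the published LDN formulation.
import torch
import torch.nn.functional as F


def sample_confidence(teacher_logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Confidence in [0, 1] from the rank of the ground-truth class in the
    teacher's logits: rank 0 (teacher top-1) -> 1.0, last rank -> 0.0."""
    num_classes = teacher_logits.size(1)
    # rank = number of classes the teacher scores strictly higher than the true class
    true_logit = teacher_logits.gather(1, labels.unsqueeze(1))       # (B, 1)
    rank = (teacher_logits > true_logit).sum(dim=1).float()          # (B,)
    return 1.0 - rank / (num_classes - 1)


def weighted_feature_loss(student_feat: torch.Tensor,
                          teacher_feat: torch.Tensor,
                          confidence: torch.Tensor) -> torch.Tensor:
    """Align feature directions via cosine similarity, down-weighting samples
    on which the teacher is unreliable so they contribute less to the update."""
    cos = F.cosine_similarity(student_feat, teacher_feat, dim=1)     # (B,)
    return ((1.0 - cos) * confidence).mean()


if __name__ == "__main__":
    # Toy usage; in practice this term would be combined with the usual
    # cross-entropy and logit-distillation losses.
    B, C, D = 8, 100, 256
    teacher_logits = torch.randn(B, C)
    labels = torch.randint(0, C, (B,))
    s_feat, t_feat = torch.randn(B, D), torch.randn(B, D)

    conf = sample_confidence(teacher_logits, labels)
    loss_fd = weighted_feature_loss(s_feat, t_feat, conf)
    print(conf.shape, loss_fd.item())
```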
Pages: 15