Knowledge Distillation Meets Open-Set Semi-supervised Learning

Cited by: 2
Authors
Yang, Jing [1 ]
Zhu, Xiatian [2 ,3 ]
Bulat, Adrian [2 ]
Martinez, Brais [2 ]
Tzimiropoulos, Georgios [2 ,4 ]
Affiliations
[1] Univ Nottingham, Nottingham, England
[2] Samsung AI Ctr, Cambridge, England
[3] Univ Surrey, Guildford, England
[4] Queen Mary Univ London, London, England
Keywords
Knowledge distillation; Structured representational knowledge; Open-set semi-supervised learning; Out-of-distribution;
DOI
10.1007/s11263-024-02192-7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Existing knowledge distillation methods mostly focus on distilling the teacher's predictions and intermediate activations. However, the structured representation, arguably one of the most critical ingredients of deep models, is largely overlooked. In this work, we propose a novel semantic representational distillation (SRD) method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student. The key idea is to leverage the teacher's classifier as a semantic critic that evaluates the representations of both teacher and student and distills semantic knowledge with high-order structured information over all feature dimensions. This is accomplished by introducing the notion of a cross-network logit, computed by passing the student's representation through the teacher's classifier. Further, treating the set of seen classes as a basis of the semantic space from a combinatorial perspective, we scale SRD to unseen classes, enabling effective exploitation of widely available, arbitrary unlabeled training data. At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL). Extensive experiments show that SRD significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine-grained face recognition tasks, as well as on the less studied yet practically crucial task of binary network distillation. Under the more realistic open-set SSL settings we introduce, we reveal that knowledge distillation is generally more effective than existing out-of-distribution sample detection, and that our proposed SRD is superior to both previous distillation and SSL competitors. The source code is available at https://github.com/jingyang2017/SRD_ossl.
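The cross-network logit described in the abstract can be illustrated with a minimal sketch: the frozen teacher classifier scores both the teacher's features and the student's features, and the student is trained so that its features yield the same class semantics. The names below (a KL-based loss, a temperature T, and the assumption that the student's feature has already been projected to the teacher classifier's input dimension) are illustrative assumptions for this sketch, not the authors' released implementation (see the repository linked above for the actual code).

import torch
import torch.nn.functional as F

def cross_network_logit_loss(student_feat, teacher_feat, teacher_classifier, T=4.0):
    # Sketch of representational distillation via cross-network logits.
    # Assumes student_feat has already been mapped to the teacher's feature
    # dimension (e.g. by a small linear projector trained with the student).
    with torch.no_grad():
        teacher_logits = teacher_classifier(teacher_feat)  # teacher's own class semantics
    cross_logits = teacher_classifier(student_feat)        # student features -> teacher head
    # Match the softened class distributions produced by the shared, frozen classifier.
    return F.kl_div(
        F.log_softmax(cross_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

Because no ground-truth label is required, a loss of this form can in principle also be applied to arbitrary unlabeled (possibly out-of-distribution) samples, which is the open-set SSL connection the abstract draws.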
Pages: 315-334 (20 pages)