Knowledge Distillation Meets Open-Set Semi-supervised Learning

Cited by: 2
Authors
Yang, Jing [1 ]
Zhu, Xiatian [2 ,3 ]
Bulat, Adrian [2 ]
Martinez, Brais [2 ]
Tzimiropoulos, Georgios [2 ,4 ]
Affiliations
[1] Univ Nottingham, Nottingham, England
[2] Samsung AI Ctr, Cambridge, England
[3] Univ Surrey, Guildford, England
[4] Queen Mary Univ London, London, England
Keywords
Knowledge distillation; Structured representational knowledge; Open-set semi-supervised learning; Out-of-distribution;
DOI
10.1007/s11263-024-02192-7
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing knowledge distillation methods mostly focus on distilling the teacher's prediction and intermediate activations. However, the structured representation, which is arguably one of the most critical ingredients of deep models, is largely overlooked. In this work, we propose a novel semantic representational distillation (SRD) method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student. The key idea is to leverage the teacher's classifier as a semantic critic that evaluates the representations of both teacher and student, distilling semantic knowledge with high-order structured information over all feature dimensions. This is accomplished by introducing a notion of cross-network logits, computed by passing the student's representation through the teacher's classifier. Further, by viewing the set of seen classes as a basis for the semantic space in a combinatorial perspective, we scale SRD to unseen classes, enabling effective exploitation of widely available, arbitrary unlabeled training data. At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL). Extensive experiments show that our SRD significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine-grained face recognition tasks, as well as on the less studied yet practically crucial task of binary network distillation. Under the more realistic open-set SSL settings we introduce, we reveal that knowledge distillation is generally more effective than existing out-of-distribution sample detection, and our proposed SRD is superior to both previous distillation and SSL competitors. The source code is available at https://github.com/jingyang2017/SRD_ossl.
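The cross-network logit idea described in the abstract can be illustrated with a minimal sketch: the student's representation is fed through the frozen teacher classifier and matched against the teacher's own logits. The feature dimensions, the linear projection used to align the student with the teacher's feature space, and the temperature-scaled KL objective below are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a cross-network-logit distillation loss (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher_feat_dim, student_feat_dim, num_classes = 512, 256, 100

# The frozen, pretrained teacher classifier acts as the "semantic critic".
teacher_classifier = nn.Linear(teacher_feat_dim, num_classes)
for p in teacher_classifier.parameters():
    p.requires_grad = False

# Hypothetical projection aligning student features with the teacher's feature space.
proj = nn.Linear(student_feat_dim, teacher_feat_dim)

def cross_network_logit_loss(student_feat, teacher_feat, temperature=4.0):
    """Pass the student representation through the teacher's classifier and
    match the resulting cross-network logits to the teacher's own logits."""
    cross_logits = teacher_classifier(proj(student_feat))   # student repr -> teacher classifier
    teacher_logits = teacher_classifier(teacher_feat)       # teacher repr -> teacher classifier
    return F.kl_div(
        F.log_softmax(cross_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

# Dummy batch; unlabeled (possibly out-of-distribution) samples can be used the same
# way, which is what links the method to open-set semi-supervised learning.
s_feat = torch.randn(8, student_feat_dim)
t_feat = torch.randn(8, teacher_feat_dim)
loss = cross_network_logit_loss(s_feat, t_feat)
```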
Pages: 315-334
Number of pages: 20