Knowledge Distillation Meets Open-Set Semi-supervised Learning

Cited by: 2
Authors
Yang, Jing [1 ]
Zhu, Xiatian [2 ,3 ]
Bulat, Adrian [2 ]
Martinez, Brais [2 ]
Tzimiropoulos, Georgios [2 ,4 ]
Affiliations
[1] Univ Nottingham, Nottingham, England
[2] Samsung AI Ctr, Cambridge, England
[3] Univ Surrey, Guildford, England
[4] Queen Mary Univ London, London, England
Keywords
Knowledge distillation; Structured representational knowledge; Open-set semi-supervised learning; Out-of-distribution;
DOI
10.1007/s11263-024-02192-7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Existing knowledge distillation methods mostly focus on distilling the teacher's predictions and intermediate activations. However, the structured representation, arguably one of the most critical ingredients of deep models, is largely overlooked. In this work, we propose a novel semantic representational distillation (SRD) method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student. The key idea is to leverage the teacher's classifier as a semantic critic that evaluates the representations of both teacher and student and distills semantic knowledge with high-order structured information over all feature dimensions. This is accomplished by introducing the notion of a cross-network logit, computed by passing the student's representation through the teacher's classifier. Further, treating the set of seen classes as a basis of the semantic space from a combinatorial perspective, we scale SRD to unseen classes, enabling effective exploitation of widely available, arbitrary unlabeled training data. At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL). Extensive experiments show that SRD significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine-grained face recognition tasks, as well as on the less studied yet practically crucial task of binary network distillation. Under the more realistic open-set SSL settings we introduce, we reveal that knowledge distillation is generally more effective than existing out-of-distribution sample detection, and that our proposed SRD is superior to both previous distillation and SSL competitors. The source code is available at https://github.com/jingyang2017/SRD_ossl.
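The cross-network logit described in the abstract can be illustrated with a minimal sketch: the frozen teacher classifier scores both the teacher's features and the student's features, and the student is trained so that its features yield the same class semantics. The names below (a KL-based loss, a temperature T, and the assumption that the student's feature has already been projected to the teacher classifier's input dimension) are illustrative assumptions for this sketch, not the authors' released implementation (see the repository linked above for the actual code).

import torch
import torch.nn.functional as F

def cross_network_logit_loss(student_feat, teacher_feat, teacher_classifier, T=4.0):
    # Sketch of representational distillation via cross-network logits.
    # Assumes student_feat has already been mapped to the teacher's feature
    # dimension (e.g. by a small linear projector trained with the student).
    with torch.no_grad():
        teacher_logits = teacher_classifier(teacher_feat)  # teacher's own class semantics
    cross_logits = teacher_classifier(student_feat)        # student features -> teacher head
    # Match the softened class distributions produced by the shared, frozen classifier.
    return F.kl_div(
        F.log_softmax(cross_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

Because no ground-truth label is required, a loss of this form can in principle also be applied to arbitrary unlabeled (possibly out-of-distribution) samples, which is the open-set SSL connection the abstract draws.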
Pages: 315-334 (20 pages)