Dual-Level Knowledge Distillation via Knowledge Alignment and Correlation

Cited by: 7
Authors
Ding, Fei [1]
Yang, Yin [1]
Hu, Hongxin [2]
Krovi, Venkat [3, 4]
Luo, Feng [1]
Affiliations
[1] Clemson Univ, Sch Comp, Clemson, SC 29634 USA
[2] Univ Buffalo, State Univ New York, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
[3] Clemson Univ, Dept Automot Engn, Clemson, SC 29634 USA
[4] Clemson Univ, Dept Mech Engn, Clemson, SC 29634 USA
Funding
National Science Foundation (USA)
Keywords
Correlation; Knowledge engineering; Task analysis; Standards; Network architecture; Prototypes; Training; Convolutional neural networks; dual-level knowledge; knowledge distillation (KD); representation learning; teacher-student model;
DOI
10.1109/TNNLS.2022.3190166
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation (KD) has become a widely used technique for model compression and knowledge transfer. We find that the standard KD method performs the knowledge alignment on an individual sample indirectly via class prototypes and neglects the structural knowledge between different samples, namely, knowledge correlation. Although recent contrastive learning-based distillation methods can be decomposed into knowledge alignment and correlation, their correlation objectives undesirably push apart representations of samples from the same class, leading to inferior distillation results. To improve the distillation performance, in this work, we propose a novel knowledge correlation objective and introduce the dual-level knowledge distillation (DLKD), which explicitly combines knowledge alignment and correlation together instead of using one single contrastive objective. We show that both knowledge alignment and correlation are necessary to improve the distillation performance. In particular, knowledge correlation can serve as an effective regularization to learn generalized representations. The proposed DLKD is task-agnostic and model-agnostic, and enables effective knowledge transfer from supervised or self-supervised pretrained teachers to students. Experiments show that DLKD outperforms other state-of-the-art methods on a large number of experimental settings including: 1) pretraining strategies; 2) network architectures; 3) datasets; and 4) tasks.
Pages: 2425-2435
Page count: 11
Related papers
50 in total
  • [1] A Teacher-Free Graph Knowledge Distillation Framework With Dual Self-Distillation
    Wu, Lirong
    Lin, Haitao
    Gao, Zhangyang
    Zhao, Guojiang
    Li, Stan Z.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (09) : 4375 - 4385
  • [2] Dual Teacher Knowledge Distillation With Domain Alignment for Face Anti-Spoofing
    Kong, Zhe
    Zhang, Wentian
    Wang, Tao
    Zhang, Kaihao
    Li, Yuexiang
    Tang, Xiaoying
    Luo, Wenhan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 13177 - 13189
  • [3] Efficient Crowd Counting via Dual Knowledge Distillation
    Wang, Rui
    Hao, Yixue
    Hu, Long
    Li, Xianzhi
    Chen, Min
    Miao, Yiming
    Humar, Iztok
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 569 - 583
  • [4] A Virtual Knowledge Distillation via Conditional GAN
    Kim, Sihwan
    IEEE ACCESS, 2022, 10 : 34766 - 34778
  • [5] Improving Knowledge Distillation via Head and Tail Categories
    Xu, Liuchi
    Ren, Jin
    Huang, Zhenhua
    Zheng, Weishi
    Chen, Yunwen
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3465 - 3480
  • [6] Collaborative Knowledge Distillation
    Zhang, Weiwei
    Guo, Yufeng
    Wang, Junhuang
    Zhu, Jianqing
    Zeng, Huanqiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (08) : 7601 - 7613
  • [7] DANE: A Dual-Level Alignment Network With Ensemble Learning for Multisource Domain Adaptation
    Yang, Yuxiang
    Wen, Lu
    Zeng, Pinxian
    Yan, Binyu
    Wang, Yan
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 11
  • [8] Highlight Every Step: Knowledge Distillation via Collaborative Teaching
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Chen, Changrui
    Dong, Zihe
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (04) : 2070 - 2081
  • [9] Visual Grounding With Dual Knowledge Distillation
    Wu, Wansen
    Cao, Meng
    Hu, Yue
    Peng, Yong
    Qin, Long
    Yin, Quanjun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 10399 - 10410
  • [10] Unsupervised Visual Representation Learning via Dual-Level Progressive Similar Instance Selection
    Fan, Hehe
    Liu, Ping
    Xu, Mingliang
    Yang, Yi
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (09) : 8851 - 8861