Virtual Student Distribution Knowledge Distillation for Long-Tailed Recognition

Cited by: 0
Authors
Liu, Haodong [1]
Huang, Xinlei [1]
Tang, Jialiang [2]
Jiang, Ning [1]
Affiliations
[1] Southwest Univ Sci & Technol, Sch Comp Sci & Technol, Mianyang 621000, Sichuan, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT IV | 2025 / Vol. 15034
Keywords
Knowledge distillation; Long-tailed learning; Data augmentation
DOI
10.1007/978-981-97-8505-6_29
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Knowledge Distillation (KD) is employed to transfer knowledge from a large pre-trained teacher network to a reliable and lightweight student network on balanced data, facilitating deployment on hardware with limited resources. However, real-world data often exhibits an imbalanced long-tailed distribution, and a teacher network trained on such data produces a correspondingly imbalanced prediction distribution. The imbalance in both of these supervision signals significantly degrades student performance, making it challenging to train a reliable student network. In this paper, we propose Virtual Student Distribution Knowledge Distillation (VSD-KD) to alleviate the problem of imbalanced supervision signals. Specifically, we match the teacher's imbalanced prediction distribution with a virtual student distribution to improve knowledge transfer, thereby mitigating the impact of the imbalanced teacher signal. Additionally, we adopt class-balanced sampling and class-adaptive data augmentation to reduce the erroneous information contributed by the head classes, thereby alleviating the impact of the imbalanced data signal. Extensive experiments demonstrate that our method can train a reliable student network on long-tailed datasets.
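The abstract outlines two components: re-balancing the teacher's imbalanced prediction signal by matching it against a virtual student distribution, and class-balanced sampling (with class-adaptive augmentation) on the data side. The paper's exact formulation is not reproduced in this record; the following is a minimal PyTorch sketch under assumptions, where a log-class-prior adjustment of the teacher logits merely stands in for the virtual-distribution matching, and the names class_balanced_sampler, rebalanced_kd_loss, and the temperature tau are illustrative rather than taken from the paper.

import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def class_balanced_sampler(labels, num_classes):
    # Class-balanced sampling: every class is drawn with roughly equal
    # probability, so tail classes appear as often as head classes.
    labels = torch.as_tensor(labels)
    class_counts = torch.bincount(labels, minlength=num_classes).float()
    per_sample_weight = 1.0 / class_counts[labels]  # rarer class -> larger weight
    return WeightedRandomSampler(per_sample_weight.tolist(),
                                 num_samples=len(labels), replacement=True)

def rebalanced_kd_loss(student_logits, teacher_logits, class_counts, tau=2.0):
    # Assumed re-balancing of the teacher signal: subtract the log class prior
    # from the teacher logits before softening, so head classes no longer
    # dominate the distribution the student is asked to match.
    log_prior = torch.log(class_counts.float() / class_counts.sum())
    balanced_teacher = F.softmax((teacher_logits - log_prior) / tau, dim=1)
    student_log_prob = F.log_softmax(student_logits / tau, dim=1)
    return F.kl_div(student_log_prob, balanced_teacher,
                    reduction="batchmean") * tau ** 2

In a training loop, the sampler would feed a DataLoader while this distillation term is typically combined with a cross-entropy loss on the ground-truth labels.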
Pages: 406-419
Page count: 14