The Staged Knowledge Distillation in Video Classification: Harmonizing Student Progress by a Complementary Weakly Supervised Framework

Cited by: 2
Authors
Wang, Chao [1 ]
Tang, Zheng [2 ]
Affiliations
[1] China Academy of Railway Sciences, Beijing 100081, China
[2] NVIDIA, Redmond, WA 98052, USA
Keywords
Training; Uncertainty; Correlation; Generators; Data models; Task analysis; Computational modeling; Knowledge distillation; weakly supervised learning; teacher-student architecture; substage learning process; video classification; label-efficient learning
DOI
10.1109/TCSVT.2023.3294977
CLC classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline codes
0808; 0809
Abstract
In the context of label-efficient learning on video data, both the distillation method and the structural design of the teacher-student architecture have a significant impact on knowledge distillation. However, the relationship between these factors has been overlooked in previous research. To address this gap, we propose a new weakly supervised learning framework for knowledge distillation in video classification, designed to improve the efficiency and accuracy of the student model. Our approach leverages substage-based learning, distilling knowledge according to the combination of student substages and the correlation between corresponding substages. We also employ progressive cascade training to mitigate the accuracy loss caused by the large capacity gap between the teacher and the student. Additionally, we propose a pseudo-label optimization strategy to refine the initial data labels. To optimize the loss functions of the different distillation substages during training, we introduce a new loss formulation based on feature distributions. Extensive experiments on both real and simulated data sets demonstrate that our approach outperforms existing distillation methods on knowledge distillation for video classification tasks. The proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.
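For orientation, the sketch below shows the standard soft-target teacher-student distillation objective (Hinton et al., 2015) that staged approaches like this one build on. It is a minimal baseline, not the paper's method: the substage combination, progressive cascade training, and feature-distribution loss described in the abstract are specific to the paper, and the `temperature` and `alpha` hyperparameters here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def soft_target_kd_loss(student_logits, teacher_logits, labels,
                        temperature=4.0, alpha=0.7):
    """Baseline soft-target distillation loss (Hinton et al., 2015).

    Blends a KL term between temperature-softened teacher and student
    distributions with cross-entropy on the hard labels. The T**2 factor
    keeps the soft-target gradients on the same scale as the CE term.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kd = F.kl_div(log_p_student, p_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: a batch of 8 video clips over 10 classes.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = soft_target_kd_loss(student_logits, teacher_logits, labels)
print(loss.item())
```

A staged variant would apply such a loss per substage and reweight the terms as training progresses, but the exact scheme is the paper's contribution and is not reproduced here.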
Pages: 6646-6660
Page count: 15
Related papers (50 total)
  • [41] Semi-weakly Supervised Learning for Prostate Cancer Image Classification with Teacher-Student Deep Convolutional Networks
    Otalora, Sebastian
    Marini, Niccolo
    Mueller, Henning
    Atzori, Manfredo
    INTERPRETABLE AND ANNOTATION-EFFICIENT LEARNING FOR MEDICAL IMAGE COMPUTING, IMIMIC 2020, MIL3ID 2020, LABELS 2020, 2020, 12446 : 193 - 203
  • [42] Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
    Ashihara, Takanori
    Moriya, Takafumi
    Matsuura, Kohei
    Tanaka, Tomohiro
    INTERSPEECH 2022, 2022, : 411 - 415
  • [43] A Semi-Supervised Method for Grain Boundary Segmentation: Teacher-Student Knowledge Distillation and Pseudo-Label Repair
    Huang, Yuanyou
    Zhang, Xiaoxun
    Ma, Fang
    Li, Jiaming
    Wang, Shuxian
    ELECTRONICS, 2024, 13 (17)
  • [44] Joint learning method with teacher-student knowledge distillation for on-device breast cancer image classification
    Sepahvand, Majid
    Abdali-Mohammadi, Fardin
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 155
  • [45] Multimodal Online Knowledge Distillation Framework for Land Use/Cover Classification Using Full or Missing Modalities
    Liu, Xiao
    Jin, Fei
    Wang, Shuxiang
    Rui, Jie
    Zuo, Xibing
    Yang, Xiaobing
    Cheng, Chuanxiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 17
  • [46] Melanoma Breslow Thickness Classification Using Ensemble-Based Knowledge Distillation With Semi-Supervised Convolutional Neural Networks
    Dominguez-Morales, Juan P.
    Hernandez-Rodriguez, Juan-Carlos
    Duran-Lopez, Lourdes
    Conejo-Mir, Julian
    Pereyra-Rodriguez, Jose-Juan
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2025, 29 (01) : 443 - 455
  • [48] Leveraging Prior-Knowledge for Weakly Supervised Object Detection Under a Collaborative Self-Paced Curriculum Learning Framework
    Zhang, Dingwen
    Han, Junwei
    Zhao, Long
    Meng, Deyu
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (04) : 363 - 380
  • [49] SSD-KD: A self-supervised diverse knowledge distillation method for lightweight skin lesion classification using dermoscopic images
    Wang, Yongwei
    Wang, Yuheng
    Cai, Jiayue
    Lee, Tim K.
    Miao, Chunyan
    Wang, Z. Jane
    MEDICAL IMAGE ANALYSIS, 2023, 84
  • [50] Semi-supervised knowledge distillation framework for global-scale urban man-made object remote sensing mapping
    Chen, Dingyuan
    Ma, Ailong
    Zhong, Yanfei
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, 122