A Teacher-Free Graph Knowledge Distillation Framework With Dual Self-Distillation

Times Cited: 1
Authors
Wu, Lirong [1 ]
Lin, Haitao [1 ]
Gao, Zhangyang [1 ]
Zhao, Guojiang [1 ]
Li, Stan Z. [1 ]
Affiliations
[1] Westlake Univ, Res Ctr Ind Future, AI Lab, Hangzhou 310000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Graph neural networks; Training; Self-supervised learning; Inference algorithms; Task analysis; Standards; Knowledge engineering; graph knowledge distillation; inference acceleration
DOI
10.1109/TKDE.2024.3374773
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed great success in handling graph-related tasks with Graph Neural Networks (GNNs). Despite the great academic success of GNNs, Multi-Layer Perceptrons (MLPs) remain the primary workhorse for practical industrial applications. One reason for this academic-industry gap is the neighborhood-fetching latency incurred by data dependency in GNNs. To bridge this gap, Graph Knowledge Distillation (GKD) has been proposed, usually built on a standard teacher-student architecture, to distill knowledge from a large teacher GNN into a lightweight student GNN or MLP. However, we find in this paper that neither teachers nor GNNs are necessary for graph knowledge distillation. We propose a Teacher-Free Graph Self-Distillation (TGS) framework that requires neither a teacher model nor GNNs during training or inference. More importantly, the proposed TGS framework is purely based on MLPs, where structural information is only implicitly used to guide dual knowledge self-distillation between each target node and its neighborhood. As a result, TGS enjoys the benefits of graph topology awareness during training but is free from data dependency at inference. Extensive experiments show that dual self-distillation greatly improves the performance of vanilla MLPs: TGS improves over vanilla MLPs by 15.54% on average and outperforms state-of-the-art GKD algorithms on six real-world datasets. In terms of inference speed, TGS infers 75x-89x faster than existing GNNs and 16x-25x faster than classical inference acceleration methods.
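To make the dual self-distillation idea in the abstract concrete, below is a minimal sketch assuming a PyTorch setup: a plain MLP produces node logits, and the graph structure is used only at training time, to distill each target node's prediction against an aggregate of its neighbors' predictions and vice versa. The names PlainMLP and dual_self_distillation_loss, the mean-aggregation of neighbor logits, the temperature tau, and the weight alpha are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PlainMLP(nn.Module):
        # Pure-MLP backbone: no neighborhood fetching is needed at inference time.
        def __init__(self, in_dim, hid_dim, n_classes):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hid_dim), nn.ReLU(),
                nn.Linear(hid_dim, n_classes),
            )

        def forward(self, x):
            return self.net(x)

    def dual_self_distillation_loss(logits, edge_index, labels, train_mask,
                                    tau=1.0, alpha=0.5):
        # Standard supervised term on labeled nodes.
        ce = F.cross_entropy(logits[train_mask], labels[train_mask])

        # Mean-aggregate neighbor logits per target node; this is the only place
        # the graph structure enters, and only during training.
        src, dst = edge_index
        n = logits.size(0)
        neigh = torch.zeros_like(logits).index_add_(0, dst, logits[src])
        deg = torch.zeros(n, device=logits.device).index_add_(
            0, dst, torch.ones(src.size(0), device=logits.device)).clamp(min=1)
        neigh = neigh / deg.unsqueeze(-1)

        log_p_node = F.log_softmax(logits / tau, dim=-1)
        log_p_neigh = F.log_softmax(neigh / tau, dim=-1)

        # Dual distillation: node -> neighborhood and neighborhood -> node, each
        # against a detached (stop-gradient) target distribution.
        kl_n2g = F.kl_div(log_p_node, log_p_neigh.detach(),
                          reduction="batchmean", log_target=True)
        kl_g2n = F.kl_div(log_p_neigh, log_p_node.detach(),
                          reduction="batchmean", log_target=True)
        return ce + alpha * (kl_n2g + kl_g2n)

A training step would compute logits = model(features) and minimize this loss over the edge list; at inference only model(features) is evaluated, which is where the freedom from neighborhood-fetching latency comes from.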
Pages: 4375-4385
Number of Pages: 11