Online adversarial knowledge distillation for graph neural networks

Cited by: 4
Authors
Wang, Can [1]
Wang, Zhe [1]
Chen, Defang [1]
Zhou, Sheng [1]
Feng, Yan [1]
Chen, Chun [1]
Affiliations
[1] Zhejiang Univ, Shanghai Inst Adv Study, Coll Comp Sci, ZJU Bangsun Joint Res Ctr, Hangzhou 310013, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Knowledge distillation; Graph neural networks; Dynamic graph; Online distillation
DOI
10.1016/j.eswa.2023.121671
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation, a technique that has gained popularity for improving the generalization of Convolutional Neural Networks (CNNs), assumes that the teacher and student models are trained on identical data distributions. However, its effect on Graph Neural Networks (GNNs) is less satisfactory, since the graph topology and node attributes are prone to evolve, leading to distribution shift. In this paper, we tackle this challenge by simultaneously training a group of graph neural networks in an online distillation fashion, where the group knowledge acts as a dynamic virtual teacher and structural changes in the graph are effectively captured. To improve distillation performance, two types of knowledge are transferred among the students so that they enhance one another: local knowledge, reflecting information in the graph topology and node attributes, and global knowledge, reflecting the predictions over classes. We transfer the global knowledge with KL-divergence, as vanilla knowledge distillation does, while exploiting the complicated structure of the local knowledge with an efficient adversarial cyclic learning framework. Extensive experiments verify the effectiveness of our proposed online adversarial distillation approach. The code is published at https://github.com/wangz3066/OnlineDistillGCN.
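
The full implementation is available at the repository linked above. As a rough, illustrative sketch of the global-knowledge transfer described in the abstract (each student GNN mimics the group-averaged soft predictions through a KL-divergence term, in addition to its supervised loss), the following PyTorch snippet may help; the two-layer dense GCN, the temperature T, and the loss weight alpha are assumptions made for the sketch rather than the authors' exact configuration, and the adversarial cyclic transfer of local knowledge is omitted.

    # Illustrative sketch only: online (group-based) distillation of the global
    # knowledge (class predictions) with KL-divergence, as described in the abstract.
    # The GCN layer, temperature, and loss weight are assumptions for illustration;
    # the adversarial cyclic transfer of local knowledge is not shown.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SimpleGCN(nn.Module):
        # Minimal two-layer GCN over a dense, pre-normalized adjacency matrix.
        def __init__(self, in_dim, hid_dim, n_classes):
            super().__init__()
            self.lin1 = nn.Linear(in_dim, hid_dim)
            self.lin2 = nn.Linear(hid_dim, n_classes)

        def forward(self, x, adj_norm):
            h = F.relu(adj_norm @ self.lin1(x))
            return adj_norm @ self.lin2(h)  # node-level class logits

    def online_distill_step(students, optimizers, x, adj_norm, labels, mask,
                            T=2.0, alpha=0.5):
        # One step of online distillation of the global knowledge: every student
        # fits the labels and mimics the group-averaged soft predictions, which
        # act as the "dynamic virtual teacher".
        logits = [s(x, adj_norm) for s in students]
        group_soft = torch.stack(
            [F.softmax(l / T, dim=-1) for l in logits]).mean(dim=0).detach()
        for student_logits, opt in zip(logits, optimizers):
            ce = F.cross_entropy(student_logits[mask], labels[mask])
            kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                          group_soft, reduction="batchmean") * (T * T)
            loss = ce + alpha * kd
            opt.zero_grad()
            loss.backward()
            opt.step()

A typical use would train several SimpleGCN students on the same node-classification data, calling online_distill_step once per epoch. In the paper, the local knowledge carried by the graph topology and node attributes is additionally aligned through adversarial cyclic learning, which this sketch does not attempt to reproduce.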
Pages: 12