Boosting Graph Neural Networks via Adaptive Knowledge Distillation

Cited by: 0
Authors
Guo, Zhichun [1]
Zhang, Chunhui [2]
Fan, Yujie [3]
Tian, Yijun [1]
Zhang, Chuxu [2]
Chawla, Nitesh V. [1]
Affiliations
[1] Univ Notre Dame, Notre Dame, IN 46556 USA
[2] Brandeis Univ, Waltham, MA 02453 USA
[3] Case Western Reserve Univ, Cleveland, OH 44106 USA
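Source
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6 | 2023
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract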
Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks. While sharing the same message passing framework, our study shows that different GNNs learn distinct knowledge from the same graph. This implies potential performance improvement by distilling the complementary knowledge from multiple models. However, knowledge distillation (KD) transfers knowledge from high-capacity teachers to a lightweight student, which deviates from our scenario: GNNs are often shallow. To transfer knowledge effectively, we need to tackle two challenges: how to transfer knowledge from compact teachers to a student with the same capacity, and how to exploit the student GNN's own learning ability. In this paper, we propose a novel adaptive KD framework, called BGNN, which sequentially transfers knowledge from multiple GNNs into a student GNN. We also introduce an adaptive temperature module and a weight boosting module. These modules guide the student to the appropriate knowledge for effective learning. Extensive experiments have demonstrated the effectiveness of BGNN. In particular, we achieve up to 3.05% improvement for node classification and 6.35% improvement for graph classification over vanilla GNNs.
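For readers unfamiliar with the knowledge distillation (KD) objective the abstract builds on, the following is a minimal sketch of a generic temperature-scaled distillation loss for a single node. It is not the BGNN implementation: the function names `log_softmax` and `distillation_loss` are hypothetical, the temperature is a fixed argument here (the abstract states that BGNN adapts it with a dedicated module but gives no details), and the weight boosting module is not modelled.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions.

    Assumption: the temperature is a fixed scalar chosen by the caller.
    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return temperature ** 2 * kl


# Hypothetical logits for one node with three classes.
teacher = [0.1, 2.5, 0.3]
student = [3.0, 1.0, 0.2]
print(distillation_loss(student, teacher, temperature=2.0))
```

A higher temperature flattens both distributions, so the student is also pushed to match the teacher's relative ranking of the non-predicted classes; when student and teacher logits agree, the loss is zero.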
Pages: 7793-7801
Number of pages: 9