Informative pseudo-labeling for graph neural networks with few labels

Cited by: 23
Authors
Li, Yayong [1 ]
Yin, Jie [2 ]
Chen, Ling [1 ]
Affiliations
[1] University of Technology Sydney, Australian Artificial Intelligence Institute, Sydney, NSW, Australia
[2] University of Sydney, Discipline of Business Analytics, Sydney, NSW, Australia
Funding
Australian Research Council
Keywords
Graph neural networks; Pseudo-labeling; Mutual information maximization; Convolutional networks
DOI
10.1007/s10618-022-00879-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Graph neural networks (GNNs) have achieved state-of-the-art results for semi-supervised node classification on graphs. Nevertheless, how to effectively learn GNNs with very few labels remains under-explored. As one of the prevalent semi-supervised methods, pseudo-labeling has been proposed to explicitly address the label scarcity problem: it augments the training set with pseudo-labeled unlabeled nodes and retrains the model in a self-training cycle. However, existing pseudo-labeling approaches often suffer from two major drawbacks. First, they conservatively expand the label set by selecting only high-confidence unlabeled nodes, without assessing their informativeness. Second, they incorporate pseudo-labels into the same loss function as genuine labels, ignoring their distinct contributions to the classification task. In this paper, we propose a novel informative pseudo-labeling framework (InfoGNN) to facilitate the learning of GNNs with very few labels. Our key idea is to pseudo-label the most informative nodes, those that can maximally represent their local neighborhoods, via mutual information maximization. To mitigate the potential label noise and class imbalance arising from pseudo-labeling, we also carefully devise a generalized cross entropy loss with class-balanced regularization to incorporate pseudo-labels into model retraining. Extensive experiments on six real-world graph datasets validate that our proposed approach significantly outperforms state-of-the-art baselines and competitive self-supervised methods on graphs.
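The abstract names two concrete ingredients that a short sketch can make tangible: scoring candidate nodes by how well they represent their local neighborhoods via mutual information maximization, and retraining on pseudo-labels with a noise-robust generalized cross entropy loss plus class-balanced regularization. The PyTorch sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the bilinear critic, the shuffled-summary negatives, the JSD-style estimator, and the uniform-prior regularizer are plausible instantiations rather than details taken from the paper, while generalized_cross_entropy follows the standard L_q form of Zhang and Sabuncu (2018).

```python
import torch
import torch.nn.functional as F

def generalized_cross_entropy(logits, pseudo_labels, q=0.7):
    """Generalized cross entropy L_q(p, y) = (1 - p_y^q) / q.
    It interpolates between cross entropy (q -> 0) and mean absolute
    error (q = 1), making retraining more robust to noisy
    pseudo-labels than plain cross entropy."""
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, pseudo_labels.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.clamp_min(1e-8).pow(q)) / q).mean()

def class_balance_regularizer(logits_unlabeled):
    """One plausible class-balanced regularizer (an assumption, not
    necessarily the paper's exact term): KL divergence between the
    mean predicted class distribution and a uniform prior, which
    penalizes pseudo-label collapse onto a few majority classes."""
    mean_p = F.softmax(logits_unlabeled, dim=1).mean(dim=0)
    num_classes = mean_p.numel()
    return torch.sum(mean_p * (mean_p.clamp_min(1e-8) * num_classes).log())

def informativeness_scores(node_emb, adj_norm, bilinear_w):
    """Hypothetical mutual-information-style scorer: each node
    embedding is contrasted with its aggregated neighborhood summary
    (positive pair) and with a shuffled summary (negative pair)
    through a bilinear critic, in the spirit of Deep InfoMax-style
    JSD estimators. Higher scores mark nodes that better represent
    their local neighborhoods."""
    # Neighborhood summaries; a row-normalized sparse adjacency is assumed.
    summaries = torch.sparse.mm(adj_norm, node_emb)
    pos = (node_emb @ bilinear_w * summaries).sum(dim=1)
    perm = torch.randperm(summaries.size(0))
    neg = (node_emb @ bilinear_w * summaries[perm]).sum(dim=1)
    return -F.softplus(-pos) - F.softplus(neg)  # per-node JSD lower-bound terms
```

In a self-training loop, one would then pseudo-label high-scoring, confidently predicted unlabeled nodes and minimize generalized_cross_entropy on them, together with class_balance_regularizer, alongside the standard cross entropy on genuine labels, mirroring the separation of loss terms the abstract argues for.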
Pages: 228-254
Number of pages: 27