Distribution Consistency based Self-Training for Graph Neural Networks with Sparse Labels

Cited by: 4
Authors
Wang, Fali [1 ]
Zhao, Tianxiang [1 ]
Wang, Suhang [1 ]
Affiliations
[1] Pennsylvania State University, University Park, PA 16802, USA
Source
Proceedings of the 17th ACM International Conference on Web Search and Data Mining (WSDM 2024) | 2024
Funding
U.S. National Science Foundation (NSF)
Keywords
Self-Training; Distribution Shifts; Graph Neural Networks; Convolutional Networks
DOI
10.1145/3616855.3635793
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Few-shot node classification poses a significant challenge for Graph Neural Networks (GNNs) due to insufficient supervision and potential distribution shifts between labeled and unlabeled nodes. Self-training has emerged as a widely adopted framework for leveraging the abundance of unlabeled data: it expands the training set by assigning pseudo-labels to selected unlabeled nodes. Various selection strategies based on confidence, information gain, etc., have been developed, but none of them takes into account the distribution shift between the training and testing node sets. The pseudo-labeling step may amplify this shift and even introduce new ones, hindering the effectiveness of self-training. Therefore, in this work, we explore the potential of explicitly bridging the distribution shift between the expanded training set and the test set during self-training. To this end, we propose a novel Distribution-Consistent Graph Self-Training (DC-GST) framework that identifies pseudo-labeled nodes that are both informative and capable of reducing the distribution discrepancy, and formulates this selection as a differentiable optimization task. A distribution-shift-aware edge predictor is further adopted to augment the graph and increase the model's generalizability in assigning pseudo-labels. We evaluate the proposed method on four publicly available benchmark datasets; extensive experiments demonstrate that our framework consistently outperforms state-of-the-art baselines.
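To make the selection idea in the abstract concrete, below is a minimal PyTorch sketch of distribution-consistent pseudo-label selection. Everything here is an illustrative assumption rather than the authors' implementation: the helper names (`mmd`, `select_pseudo_labels`) are hypothetical, an RBF-kernel MMD stands in for the paper's discrepancy measure, and a greedy search stands in for the paper's differentiable optimization.

```python
import torch

def mmd(x: torch.Tensor, y: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # Squared maximum mean discrepancy with an RBF kernel; a stand-in
    # (assumption) for the paper's train/test discrepancy measure.
    def k(a, b):
        return torch.exp(-gamma * torch.cdist(a, b).pow(2)).mean()
    return k(x, x) + k(y, y) - 2 * k(x, y)

def select_pseudo_labels(emb, probs, train_idx, test_idx, cand_idx,
                         n_select, tau=0.9):
    # Greedily add confident unlabeled nodes whose inclusion in the
    # training set most shrinks the train/test embedding discrepancy.
    conf, pseudo = probs.max(dim=1)
    cand = [i for i in cand_idx.tolist() if conf[i] >= tau]  # confidence filter
    chosen, cur = [], list(train_idx.tolist())
    for _ in range(min(n_select, len(cand))):
        gaps = {i: mmd(emb[cur + [i]], emb[test_idx]).item() for i in cand}
        best = min(gaps, key=gaps.get)   # candidate with the smallest gap
        chosen.append(best)
        cur.append(best)
        cand.remove(best)
    idx = torch.tensor(chosen, dtype=torch.long)
    return idx, pseudo[idx]

# Toy usage with random tensors in place of a trained GNN's outputs.
torch.manual_seed(0)
emb = torch.randn(100, 16)                      # node embeddings
probs = torch.softmax(torch.randn(100, 7), 1)   # class probabilities
train_idx = torch.arange(0, 10)                 # labeled nodes
test_idx = torch.arange(60, 100)                # test nodes
cand_idx = torch.arange(10, 60)                 # unlabeled candidates
idx, labels = select_pseudo_labels(emb, probs, train_idx, test_idx,
                                   cand_idx, n_select=5, tau=0.2)
print(idx.tolist(), labels.tolist())
```

In the paper itself, the selection is cast as a differentiable optimization and is paired with a distribution-shift-aware edge predictor that augments the graph; neither component is reproduced in this sketch.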
Pages: 712-720 (9 pages)