A data-centric framework of improving graph neural networks for knowledge graph embedding

Cited by: 0
Authors
Cao, Yanan [1,2]
Lin, Xixun [1]
Wu, Yongxuan [1,2]
Shi, Fengzhao [1,2]
Shang, Yanmin [1,2]
Tan, Qingfeng [3]
Zhou, Chuan [2,4]
Zhang, Peng [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou, Peoples R China
[4] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2025, Vol. 28, No. 1
Funding
National Natural Science Foundation of China;
Keywords
Knowledge graph embedding; Graph homophily; Graph neural networks; Data-centric learning;
DOI
10.1007/s11280-024-01320-0
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Knowledge Graph Embedding (KGE) aims to learn representations of the entities and relations of a knowledge graph (KG). Recently, Graph Neural Networks (GNNs) have achieved great success on KGE, but the reason behind this success is usually attributed simply to their ability to learn knowledge graph structure, leaving the internal mechanism poorly understood. In this work, we first study a fundamental problem: what are the important factors that enable GNNs to help KGE? To investigate this problem, we discuss the core idea of current GNN models for KGs and propose a new assumption of relational homophily, namely that connected nodes possess similar features after the relation's transformation, to explain why aggregating neighbors with relations can help KGE. Based on model and empirical analyses, we then introduce a novel data-centric framework for applying GNNs to KGE, called KSG-GNN. In KSG-GNN, we construct a new graph structure from the KG, named the Knowledge Similarity Graph (KSG), in which each node connects to its most similar nodes as neighbors, and we then apply GNNs on this graph to perform KGE. Instead of following the relational homophily assumption in the KG, the KSG aligns with homogeneous graphs that directly satisfy the homophily assumption. Hence, any GNN developed for homogeneous graphs, such as GCN, GAT, or GraphSAGE, can be applied out-of-the-box as KSG-GNN without modification, which provides a more general and effective GNN paradigm. Finally, we conduct extensive experiments on two benchmark datasets, FB15k-237 and WN18RR, demonstrating the superior performance of KSG-GNN over multiple strong baselines. The source code is available at https://github.com/advancer99/WWWJ-KGE.
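To make the data-centric idea concrete, the following is a minimal sketch, not the authors' released implementation: it assumes the similarity graph is built by connecting each entity to its k nearest neighbors under cosine similarity of pretrained entity embeddings, and the helper names (build_ksg, GCNLayer) are illustrative. The resulting graph is homogeneous, so a plain GCN layer can refine the embeddings without any relation-specific machinery.

```python
# Sketch only: k-NN similarity graph over entity embeddings + vanilla GCN pass.
# Construction details (cosine similarity, k, single layer) are assumptions,
# not the paper's exact KSG-GNN recipe.
import torch
import torch.nn.functional as F

def build_ksg(entity_emb: torch.Tensor, k: int = 10) -> torch.Tensor:
    """Symmetric, self-looped adjacency of a k-NN graph built from
    cosine similarity between entity embeddings."""
    normed = F.normalize(entity_emb, dim=-1)
    sim = normed @ normed.T
    sim.fill_diagonal_(float('-inf'))          # exclude self when picking neighbors
    topk = sim.topk(k, dim=-1).indices         # k most similar entities per node
    n = entity_emb.size(0)
    adj = torch.zeros(n, n)
    adj.scatter_(1, topk, 1.0)                 # directed k-NN edges
    adj = ((adj + adj.T) > 0).float()          # symmetrize
    adj += torch.eye(n)                        # add self-loops
    return adj

class GCNLayer(torch.nn.Module):
    """A standard GCN layer; the KSG is homogeneous and homophilous,
    so no relation-specific transformation is needed."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=-1)
        norm = deg.pow(-0.5)
        adj_hat = norm.unsqueeze(1) * adj * norm.unsqueeze(0)  # D^-1/2 A D^-1/2
        return F.relu(self.lin(adj_hat @ x))

# Usage: refine (stand-in) pretrained KGE entity embeddings over the KSG.
entity_emb = torch.randn(100, 64)              # placeholder for pretrained embeddings
adj = build_ksg(entity_emb, k=10)
refined = GCNLayer(64, 64)(entity_emb, adj)
```

Because the neighborhoods are defined by embedding similarity rather than KG relations, the GCN layer above could be swapped for GAT, GraphSAGE, or any other homogeneous-graph GNN without modification, which is the out-of-the-box property the abstract describes.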
Pages: 21