A data-centric framework of improving graph neural networks for knowledge graph embedding

Cited by: 0
Authors
Cao, Yanan [1, 2]
Lin, Xixun [1]
Wu, Yongxuan [1, 2]
Shi, Fengzhao [1, 2]
Shang, Yanmin [1, 2]
Tan, Qingfeng [3]
Zhou, Chuan [2, 4]
Zhang, Peng [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou, Peoples R China
[4] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
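Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2025, Vol. 28, Issue 1
Funding
National Natural Science Foundation of China
Keywords
Knowledge graph embedding; Graph homophily; Graph neural networks; Data-centric learning
DOI
10.1007/s11280-024-01320-0
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract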
Knowledge Graph Embedding (KGE) aims to learn representations of the entities and relations in a knowledge graph (KG). Graph Neural Networks (GNNs) have recently achieved great success on KGE, but most explanations simply attribute this success to effective learning of the knowledge graph structure, which leaves the internal mechanism poorly understood. In this work, we first study a fundamental question: which factors are important for GNNs to help KGE? To investigate it, we discuss the core idea of current GNN models for KGs and propose a new assumption, relational homophily, which states that connected nodes possess similar features after the relation's transformation; this explains why aggregating neighbors together with their relations can help KGE. Based on model and empirical analyses, we then introduce KSG-GNN, a novel data-centric framework for applying GNNs to KGE. In KSG-GNN, we construct from the KG a new graph structure named the Knowledge Similarity Graph (KSG), in which each node is connected to its similar nodes as neighbors, and then apply GNNs on this graph to perform KGE. Instead of following the relational homophily assumption of the KG, the KSG aligns with homogeneous graphs, which directly satisfy the homophily assumption. Hence, any GNN developed for homogeneous graphs, such as GCN, GAT, or GraphSAGE, can be applied out of the box as KSG-GNN without modification, which provides a more general and effective GNN paradigm. Finally, we conduct extensive experiments on two benchmark datasets, FB15k-237 and WN18RR, demonstrating the superior performance of KSG-GNN over multiple strong baselines. The source code is available at https://github.com/advancer99/WWWJ-KGE.
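The idea of connecting each node to its similar nodes and then running a homogeneous GNN on the result can be illustrated with a minimal sketch. The abstract does not say how similarity is measured or how the KSG is actually built, so everything here is an assumption made for illustration: cosine similarity with k nearest neighbors, random placeholder entity features, a single untrained GCN-style layer, and the hypothetical names `build_similarity_graph` and `gcn_layer`. It is not the authors' implementation; see the linked repository for that.

```python
import numpy as np


def build_similarity_graph(features, k):
    """Link each node to its k most cosine-similar nodes (assumed similarity measure).

    Returns a symmetric 0/1 adjacency matrix with an empty diagonal.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    n = features.shape[0]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize the directed k-NN edges


def gcn_layer(adj, features, weight):
    """One GCN-style propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


# Placeholder inputs: 6 entities with 4-dimensional features (random, for illustration only).
rng = np.random.default_rng(0)
entity_features = rng.normal(size=(6, 4))
ksg_adj = build_similarity_graph(entity_features, k=2)

# An untrained weight matrix; a real model would learn it against a KGE objective.
weight = rng.normal(size=(4, 3))
hidden = gcn_layer(ksg_adj, entity_features, weight)
```

Because the resulting graph has a single edge type, the aggregation step needs no relation-specific parameters, which is the property that lets an unmodified homogeneous GNN be used.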
Pages: 21