CAFE: Knowledge graph completion using neighborhood-aware features

Cited by: 24
Authors
Borrego, Agustin [1]
Ayala, Daniel [1]
Hernandez, Inma [1]
Rivero, Carlos R. [2]
Ruiz, David [1]
Affiliations
[1] Univ Seville, ETS Ingn Informat, Avda Reina Mercedes S-N, Seville, Spain
[2] Rochester Inst Technol, 92 Lomb Mem Dr, Rochester, NY 14623 USA
Keywords
Knowledge graphs; Knowledge graph completion; Link prediction; Machine learning
DOI
10.1016/j.engappai.2021.104302
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Knowledge Graphs (KGs) currently contain a vast amount of structured information in the form of entities and relations. Because KGs are often constructed automatically by means of information extraction processes, they may miss information that was either not present in the original source or not successfully extracted. As a result, KGs might lack useful and valuable information. Current approaches that aim to complete missing information in KGs have two main drawbacks. First, some depend on embedded representations, which impose a very expensive preprocessing step and must be recomputed as the KG grows. Second, others are based on long random paths that may not cover all relevant information, whereas exhaustively analyzing all possible paths between entities is very time-consuming. In this paper, we present an approach to complete KGs based on evaluating candidate triples using a set of neighborhood-based features. Our approach exploits the highly connected nature of KGs by analyzing the entities and relations surrounding any given pair of entities, while avoiding full recomputations as new entities are added. Our results indicate that our proposal identifies correct triples more effectively than other state-of-the-art approaches, achieving higher average F1 scores on all tested datasets. Therefore, we conclude that the information present in the vicinities of the two entities within a candidate triple can be leveraged to determine whether that triple is missing from the KG or not.
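To make the idea of neighborhood-based features concrete, the following is a minimal Python sketch of scoring inputs for a candidate triple computed from the entities and relations around its two endpoints. It is not the feature set defined in the paper: the function names (`build_index`, `neighbors`, `neighborhood_features`), the four features, and the toy graph are all assumptions chosen only for illustration.

```python
from collections import defaultdict

def build_index(triples):
    """Index a KG given as (source, relation, target) triples."""
    out_edges = defaultdict(set)  # entity -> {(relation, target)}
    in_edges = defaultdict(set)   # entity -> {(relation, source)}
    for s, r, t in triples:
        out_edges[s].add((r, t))
        in_edges[t].add((r, s))
    return out_edges, in_edges

def neighbors(entity, out_edges, in_edges):
    """Entities one hop away from `entity`, ignoring edge direction."""
    outgoing = {e for _, e in out_edges[entity]}
    incoming = {e for _, e in in_edges[entity]}
    return outgoing | incoming

def neighborhood_features(candidate, out_edges, in_edges):
    """Illustrative (hypothetical) features for a candidate triple (s, r, t)."""
    s, r, t = candidate
    ns = neighbors(s, out_edges, in_edges) - {t}
    nt = neighbors(t, out_edges, in_edges) - {s}
    shared = ns & nt
    union = ns | nt
    return {
        # How many entities are adjacent to both endpoints.
        "shared_neighbors": len(shared),
        # Overlap of the two one-hop neighborhoods.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # How often the source already uses relation r as an outgoing edge.
        "source_out_r": sum(1 for rel, _ in out_edges[s] if rel == r),
        # How often the target already receives relation r.
        "target_in_r": sum(1 for rel, _ in in_edges[t] if rel == r),
    }

# Toy graph, invented for this example.
TRIPLES = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("alice", "lives_in", "seville"),
    ("bob", "lives_in", "seville"),
    ("acme", "located_in", "seville"),
    ("carol", "lives_in", "rochester"),
]

out_edges, in_edges = build_index(TRIPLES)
colleague = neighborhood_features(("alice", "colleague_of", "bob"), out_edges, in_edges)
relocated = neighborhood_features(("carol", "lives_in", "seville"), out_edges, in_edges)
print(colleague)
print(relocated)
```

In a sketch like this, feature vectors for known-correct and known-incorrect triples would be fed to an ordinary classifier. Adding a new triple only touches the index entries of its two endpoints, which is one plausible reading of how a neighborhood-based approach avoids full recomputation as the graph grows.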
Pages: 10