Anytime bottom-up rule learning for large-scale knowledge graph completion

Cited by: 5

Authors:

Meilicke, Christian [1]

Chekol, Melisachew Wudage [2]

Betz, Patrick [1]

Fink, Manuel [1]

Stuckenschmidt, Heiner [1]

Affiliations:

[1] Univ Mannheim, Mannheim, Germany

[2] Univ Utrecht, Utrecht, Netherlands

Source:

VLDB JOURNAL | 2024 / Vol. 33 / Issue 01

Keywords:

Knowledge graph completion; Link prediction; Rule learning;

DOI:

10.1007/s00778-023-00800-5

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Knowledge graph completion is the task of predicting correct facts that can be expressed by the vocabulary of a given knowledge graph but are not explicitly stated in that graph. Broadly, there are two main approaches for solving the knowledge graph completion problem. Sub-symbolic approaches embed the nodes and/or edges of a given graph into a low-dimensional vector space and use a scoring function to determine the plausibility of a given fact. Symbolic approaches learn a model that remains within the primary representation of the given knowledge graph. Rule-based approaches are well-known examples. One such approach is AnyBURL. It works by sampling random paths, which are generalized into Horn rules. Previously published results show that the prediction quality of AnyBURL is close to the current state of the art, with the additional benefit of offering an explanation for a predicted fact. In this paper, we propose several improvements and extensions of AnyBURL. In particular, we focus on AnyBURL's capability to be successfully applied to large and very large datasets. Overall, we propose four separate extensions: (i) We add to each rule a set of pairwise inequality constraints which enforces that different variables cannot be grounded by the same entities, which results in more appropriate confidence estimations. (ii) We introduce reinforcement learning to guide path sampling in order to use available computational resources more efficiently. (iii) We propose an efficient sampling strategy to approximate the confidence of a rule instead of computing its exact value. (iv) We develop a new multithreaded AnyBURL, which incorporates all previously mentioned modifications. In an experimental study, we show that our approach outperforms both symbolic and sub-symbolic approaches in large-scale knowledge graph completion. It has a higher prediction quality and requires significantly less time and computational resources.
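
Extensions (i) and (iii) can be illustrated with a minimal sketch. It is not the AnyBURL implementation: the toy graph, the relation names, the function names `build_index` and `confidence`, and the choice to sample head-variable entities are all assumptions made for illustration. The sketch scores a single hypothetical rule of the form `head(X, Y) <- r1(X, A), r2(A, Y)`, optionally enforcing pairwise inequality between X, A, and Y, and optionally estimating the confidence from a random subset of entities bound to X rather than from all of them.

```python
import random
from collections import defaultdict

def build_index(triples):
    """Index triples as relation -> subject -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for s, r, o in triples:
        index[r][s].add(o)
    return index

def confidence(index, head, r1, r2, distinct=True, max_subjects=None, rng=None):
    """Confidence of head(X, Y) <- r1(X, A), r2(A, Y).

    distinct=True applies pairwise inequality constraints on X, A, Y.
    max_subjects (hypothetical parameter) limits the body groundings to a
    random sample of entities bound to X, giving an approximate value.
    """
    subjects = sorted(index[r1])
    if max_subjects is not None and max_subjects < len(subjects):
        subjects = (rng or random).sample(subjects, max_subjects)
    pairs = set()
    for x in subjects:
        for a in index[r1][x]:
            for y in index[r2].get(a, ()):
                if distinct and len({x, a, y}) < 3:
                    continue  # two variables grounded by the same entity
                pairs.add((x, y))
    if not pairs:
        return 0.0
    # Count body groundings for which the head fact is also in the graph.
    hits = sum(1 for x, y in pairs if y in index[head].get(x, ()))
    return hits / len(pairs)

# Toy graph (invented for this example).
TRIPLES = [
    ("ann", "hasParent", "pat"), ("bob", "hasParent", "pat"),
    ("pat", "hasChild", "ann"), ("pat", "hasChild", "bob"),
    ("ann", "hasSibling", "bob"), ("bob", "hasSibling", "ann"),
]
INDEX = build_index(TRIPLES)

# Rule: hasSibling(X, Y) <- hasParent(X, A), hasChild(A, Y)
print(confidence(INDEX, "hasSibling", "hasParent", "hasChild", distinct=False))  # 0.5
print(confidence(INDEX, "hasSibling", "hasParent", "hasChild", distinct=True))   # 1.0
print(confidence(INDEX, "hasSibling", "hasParent", "hasChild",
                 max_subjects=1, rng=random.Random(0)))                          # 1.0
```

Without the inequality constraints, groundings such as X = Y = ann count against the rule even though nobody is their own sibling, which halves the confidence in this toy graph. With the constraints those groundings are excluded. The sampled variant visits only a subset of X entities, trading exactness for lower cost on large graphs.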

Citation

Pages: 131-161

Page count: 31

