Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

Cited by: 65

Authors:

Akrami, Farahnaz [1]

Saeef, Mohammed Samiul [1]

Zhang, Qingheng [2]

Hu, Wei [2]

Li, Chengkai [1]

Affiliations:

[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA

[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China

Source:

SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1145/3318464.3380599

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In the active research area of employing embedding models for knowledge graph completion, particularly for the task of link prediction, most prior studies used two benchmark datasets, FB15k and WN18, to evaluate such models. Most triples in these and other datasets used in such studies belong to reverse and duplicate relations, which exhibit high data redundancy due to semantic duplication, correlation, or data incompleteness. This is a case of excessive data leakage: a model is trained using features that otherwise would not be available when the model needs to be applied for real prediction. There are also Cartesian product relations, for which every triple formed by the Cartesian product of applicable subjects and objects is a true fact. Link prediction on the aforementioned relations is easy and can be achieved with even better accuracy using straightforward rules instead of sophisticated embedding models. A more fundamental defect of these models is that the link prediction scenario, given such data, is non-existent in the real world. This paper is the first systematic study with the main objective of assessing the true effectiveness of embedding models when the unrealistic triples are removed. Our experimental results show these models are much less accurate than previously perceived. Their poor accuracy renders link prediction a task without a truly effective automated solution. Hence, we call for re-investigation of possible effective approaches.
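The leakage described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the overlap threshold, and the toy triples (with made-up relation names `directed` and `directed_by`) are all assumptions chosen for illustration. The sketch flags two relations as a reverse pair when most (head, tail) pairs of one appear flipped in the other, then answers a test query by simply flipping a training triple, which is the kind of straightforward rule the abstract refers to.

```python
from collections import defaultdict


def find_reverse_pairs(triples, threshold=0.8):
    """Return (r1, r2) pairs whose (head, tail) sets mostly mirror each other.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    pairs_by_rel = defaultdict(set)
    for h, r, t in triples:
        pairs_by_rel[r].add((h, t))

    reverse_pairs = []
    rels = sorted(pairs_by_rel)
    for i, r1 in enumerate(rels):
        for r2 in rels[i + 1:]:
            # Flip r2's pairs so they can be compared directly with r1's.
            flipped = {(t, h) for h, t in pairs_by_rel[r2]}
            overlap = len(pairs_by_rel[r1] & flipped)
            if (overlap / len(pairs_by_rel[r1]) >= threshold
                    and overlap / len(pairs_by_rel[r2]) >= threshold):
                reverse_pairs.append((r1, r2))
    return reverse_pairs


def predict_tail(train, head, relation, reverse_pairs):
    """Answer (head, relation, ?) by flipping training triples of a reverse relation."""
    partners = {b for a, b in reverse_pairs if a == relation}
    partners |= {a for a, b in reverse_pairs if b == relation}
    return {h for h, r, t in train if r in partners and t == head}


# Hypothetical toy training set; the test fact (film_c, directed_by, person_z)
# is absent, but its reverse is present, so the answer leaks from training data.
train = [
    ("film_a", "directed_by", "person_x"),
    ("person_x", "directed", "film_a"),
    ("film_b", "directed_by", "person_y"),
    ("person_y", "directed", "film_b"),
    ("person_z", "directed", "film_c"),
]
reverse_pairs = find_reverse_pairs(train, threshold=0.6)
prediction = predict_tail(train, "film_c", "directed_by", reverse_pairs)
print(reverse_pairs, prediction)
```

On this toy data the rule recovers the held-out tail without any learned embedding, which is why removing such triples gives a more realistic picture of model accuracy.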

Pages: 1995-2010

Page count: 16

