MEM-KGC: Masked Entity Model for Knowledge Graph Completion With Pre-Trained Language Model

Cited by: 25
Authors
Choi, Bonggeun [1]
Jang, Daesik [2]
Ko, Youngjoong [2]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon 16419, Gyeonggi Do, South Korea
[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, Gyeonggi Do, South Korea
Keywords
Task analysis; Predictive models; Training; Bit error rate; Semantics; Micromechanical devices; Licenses; Knowledge graph completion; link prediction; masked language model; pre-trained language model
DOI
10.1109/ACCESS.2021.3113329
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
The knowledge graph completion (KGC) task aims to predict missing links in knowledge graphs. Recently, several KGC models based on translational distance or semantic matching methods have been proposed and have achieved meaningful results. However, existing models share a significant shortcoming: they cannot learn an embedding for an entity that does not appear in the training phase. As a result, such models fall back on randomly initialized embeddings for entities unseen during training, which causes a critical drop in performance at test time. To solve this problem, we propose a new approach that performs the KGC task with the masked language model (MLM) objective used to pre-train language models. Given a triple (head entity, relation, tail entity), we mask the tail entity and treat the head entity and the relation as its context. The model then predicts the masked entity from among all entities, so the task proceeds exactly as in an MLM, which predicts a masked token from its surrounding context. Our experimental results show that the proposed model achieves significantly improved performance when unseen entities appear during the test phase and achieves state-of-the-art performance on the WN18RR dataset.
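For a concrete picture of the masking step described above, the following is a minimal, illustrative sketch (not the authors' released code) of how a triple can be converted into a masked input for a BERT-style encoder and how all entities can be scored at the masked position. The model name, the toy entity vocabulary, and the linear entity classifier are assumptions made for illustration only.

```python
# Minimal, illustrative sketch of the idea described in the abstract: a triple
# (head, relation, tail) is turned into a masked sequence, and a BERT-style
# encoder predicts the tail entity for the [MASK] position. The model name,
# the toy entity vocabulary, and the linear entity classifier are assumptions.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
encoder.eval()

# Hypothetical entity vocabulary: every entity in the knowledge graph gets an index.
entity2id = {"Barack Obama": 0, "Hawaii": 1, "United States": 2}
entity_classifier = torch.nn.Linear(encoder.config.hidden_size, len(entity2id))


def score_tail_entities(head_text: str, relation_text: str) -> torch.Tensor:
    """Mask the tail entity and score every entity as a candidate for the mask."""
    # The head entity and relation serve as the context; [MASK] stands in for the tail.
    text = f"{head_text} {relation_text} {tokenizer.mask_token}"
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, hidden_size)
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    mask_vec = hidden[0, mask_pos]                        # representation of [MASK]
    return entity_classifier(mask_vec)                    # one logit per entity


logits = score_tail_entities("Barack Obama", "place of birth")
predicted = max(entity2id, key=lambda e: logits[entity2id[e]].item())
print(predicted)  # the classifier is untrained here, so the prediction is arbitrary
```

In the paper's setting, such a classifier over all entities would be trained jointly with the encoder, which is what lets the model score entities from their textual context rather than from per-entity embeddings learned only for entities seen during training.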
Pages: 132025-132032
Page count: 8