Transfer learning for fine-grained entity typing

Cited by: 7
Authors
Hou, Feng [1 ]
Wang, Ruili [1 ]
Zhou, Yi [2 ]
Affiliations
[1] Massey Univ, Sch Nat & Computat Sci, Albany, New Zealand
[2] Shanghai Res Ctr Brain Sci & Brain Inspired Intel, Zhangjiang Lab, Shanghai, Peoples R China
Keywords
Transfer learning; Topic model; Language model; Topic anchor; Fine-grained entity typing;
DOI
10.1007/s10115-021-01549-5
Chinese Library Classification: TP18 [Artificial Intelligence Theory];
Discipline classification codes: 081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fine-grained entity typing (FGET) aims to classify mentions of entities into hierarchical, fine-grained semantic types. Existing FGET approaches have two main issues. First, training corpora for FGET are normally labeled automatically, which inevitably introduces noise. Existing approaches either directly tweak noisy labels in corpora by heuristics or algorithmically retreat to parent types, both of which yield coarse-grained type labels instead of fine-grained ones. Second, existing approaches usually use recurrent neural networks to generate feature representations of mention phrases and their contexts, which perform relatively poorly on long contexts and out-of-vocabulary (OOV) words. In this paper, we propose a transfer learning-based approach to extract more effective feature representations and to offset label noise. More precisely, we adopt three transfer learning schemes: (i) transferring sub-word embeddings to generate better OOV embeddings; (ii) using a pre-trained language model to generate more effective context features; (iii) using a pre-trained topic model to transfer topic-type relatedness through topic anchors and to select among confusing fine-grained types at inference time. The pre-trained topic model can offset label noise without retreating to coarse-grained types. The experimental results demonstrate the effectiveness of our transfer learning approach for FGET.
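Scheme (i), composing embeddings for OOV words from sub-word units, can be sketched in the style of fastText-like character n-gram embeddings. This is an illustrative reconstruction under assumptions, not the paper's exact method; the `ngram_vectors` lookup table is a hypothetical stand-in for a pre-trained sub-word embedding matrix.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word with boundary markers (fastText-style)."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def oov_embedding(word, ngram_vectors, dim=300):
    """Embed an OOV word by averaging the pre-trained vectors of the
    character n-grams it contains; fall back to a zero vector if none match."""
    vecs = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy usage with a hypothetical pre-trained n-gram table:
ngram_vectors = {g: np.random.rand(300) for g in char_ngrams("misspelt")}
vec = oov_embedding("misspelt", ngram_vectors)  # shape (300,)
```

Because most rare or misspelled words still share character n-grams with in-vocabulary words, this yields informative vectors where a whole-word embedding table would return nothing.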
Pages: 845-866
Page count: 22