Named Entity Recognition by Using XLNet-BiLSTM-CRF

Cited: 46
Authors
Yan, Rongen [1 ]
Jiang, Xue [2 ]
Dang, Depeng [1 ]
Affiliations
[1] Beijing Normal Univ, Sch Artificial Intelligence, Beijing 100875, Peoples R China
[2] Univ Sci & Technol Beijing, Beijing Adv Innovat Ctr Mat Genome Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; Named entity recognition; XLNet; Bi-directional long short-term memory; Tagging
DOI
10.1007/s11063-021-10547-1
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Named entity recognition (NER) is the basis for many natural language processing (NLP) tasks, such as information extraction and question answering, and its accuracy directly affects the results of downstream tasks. Most existing methods are implemented with neural networks; however, word vectors learned from a small dataset cannot describe unusual, previously unseen entities accurately, so the results are not sufficiently accurate. Recently, XLNet, a new pre-trained model, has yielded satisfactory results on many NLP tasks, but integrating XLNet embeddings into existing NLP pipelines is not straightforward. In this paper, a new neural network model is proposed to improve the effectiveness of NER by combining a pre-trained XLNet, a bi-directional long short-term memory (Bi-LSTM) network, and a conditional random field (CRF). The pre-trained XLNet model is used to extract sentence features, which are then fed into the classic neural NER architecture. In addition, the superiority of XLNet on NER tasks is demonstrated. We evaluate our model on the CoNLL-2003 English and WNUT-2017 datasets and show that XLNet-BiLSTM-CRF obtains state-of-the-art results.
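The CRF layer described in the abstract scores whole tag sequences rather than individual tokens, and its inference step is Viterbi decoding over per-token emission scores plus a tag-transition matrix. The sketch below, in plain Python, shows only that decoding step; in the full model the emission scores would come from a linear layer over Bi-LSTM hidden states built on XLNet features, whereas the tag set, scores, and transitions here are illustrative stand-ins, not values from the paper.

```python
# Minimal sketch of CRF (Viterbi) decoding on top of emission scores.
# In the full XLNet-BiLSTM-CRF model, emissions come from the BiLSTM
# over XLNet features; the toy numbers below are illustrative only.

def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for one sentence.

    emissions  : list of dicts, one per token, tag -> emission score
    transitions: dict, (prev_tag, tag) -> transition score
    tags       : list of tag names
    """
    # scores[t] = best score of any path ending in tag t so far
    scores = {t: emissions[0][t] for t in tags}
    backptr = []  # backptr[i][t] = best previous tag for tag t at step i
    for emit in emissions[1:]:
        new_scores, ptrs = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: scores[p] + transitions[(p, t)])
            ptrs[t] = best_prev
            new_scores[t] = scores[best_prev] + transitions[(best_prev, t)] + emit[t]
        scores, _ = new_scores, backptr.append(ptrs)
    # Trace back from the best final tag.
    best_last = max(tags, key=lambda t: scores[t])
    path = [best_last]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    return list(reversed(path))

tags = ["O", "B-PER", "I-PER"]
# Transitions that penalize I-PER unless preceded by B-PER/I-PER,
# encoding the BIO constraint the CRF learns in practice.
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("O", "I-PER")] = -10.0
# Toy emission scores for the three tokens "John", "Smith", "runs".
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.1},
]
print(viterbi_decode(emissions, transitions, tags))  # → ['B-PER', 'I-PER', 'O']
```

This illustrates why the CRF helps: the transition penalty keeps the decoder from emitting an `I-PER` that does not continue an entity, a constraint a per-token classifier cannot enforce.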
Pages: 3339-3356
Page count: 18