A Korean named entity recognition method using Bi-LSTM-CRF and masked self-attention

Citations: 0
Authors
Jin, Guozhe [1, 2]
Yu, Zhezhou [1]
Affiliations
[1] [1,Jin, Guozhe
[2] Yu, Zhezhou
Source
Yu, Zhezhou (yuzz@jlu.edu.cn) | Year: 1600 / Volume: Academic Press / Issue: 65
Keywords
Long short-term memory
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Named entity recognition (NER) is a fundamental task in natural language processing. Existing Korean NER methods use Korean morphemes, syllable sequences, and part-of-speech tags as features, and apply a sequence labeling model to the problem. In Korean, on the one hand, a morpheme itself carries strong indicative information about named entities (especially time and person entities). On the other hand, the context of the target morpheme plays an important role in recognizing its named entity (NE) tag. To make full use of these two features, we propose two auxiliary tasks. The first is a morpheme-level NE tagging task, which captures the NE features of the syllable sequence composing a morpheme. The second is a context-based NE tagging task, which aims to capture the context features of the target morpheme through a masked self-attention network. These two tasks are jointly trained with a Bi-LSTM-CRF NER tagger. Experimental results on the Klpexpo 2016 corpus and the Naver NLP Challenge 2018 corpus show that our model outperforms strong baseline systems and achieves state-of-the-art performance. © 2020 Elsevier Ltd
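The abstract does not spell out how the masked self-attention network is built. As a rough illustration only, the sketch below shows one common way to make self-attention attend to context alone: the diagonal of the score matrix is masked so that no position can attend to itself. The masking scheme, the function name masked_self_attention, the projection matrices w_q, w_k, w_v, and all dimensions are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def masked_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention where each position cannot attend to itself.

    x: (seq_len, d_model) array of morpheme representations, seq_len >= 2.
    Returns the (seq_len, d_v) context vectors and the attention weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Assumed masking scheme: hide the diagonal so position i only sees other positions.
    np.fill_diagonal(scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)                      # masked entries become exactly 0
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy input: 5 "morphemes" with 8-dimensional random representations.
rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
context, weights = masked_self_attention(x, w_q, w_k, w_v)
```

In this sketch the output at position i is a weighted sum of the value vectors of the other positions only; the representation at position i still shapes the weights through its query. In a joint setup like the one the abstract describes, such context vectors could feed an auxiliary tagging loss alongside the main Bi-LSTM-CRF loss, but how the paper actually combines them is not stated here.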