Learning to select pseudo labels: a semi-supervised method for named entity recognition

Cited by: 0
Authors
Zhen-zhen Li
Da-wei Feng
Dong-sheng Li
Xi-cheng Lu
Affiliations
[1] National University of Defense Technology, College of Computer
Source
Frontiers of Information Technology & Electronic Engineering | 2020 / Vol. 21
Keywords
Named entity recognition; Unlabeled data; Deep learning; Semi-supervised method; TP391.1;
DOI
Not available
CLC number
Subject classification code
Abstract
Deep learning models have achieved state-of-the-art performance in named entity recognition (NER), but this performance relies heavily on substantial amounts of labeled data. In specialized domains such as medicine, finance, and the military, labeled data is scarce while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but they neglect the large amount of entity information in unlabeled data, which may benefit the NER task. In this study, we propose a semi-supervised method for NER that learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework has three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and iteratively creating new labeled data and improving the NER model. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves performance comparable to that of state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.
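The three-step loop described in the abstract (train a tagger, score its pseudo labels, add the accepted ones and retrain) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: `train_tagger` is a toy per-token frequency tagger standing in for the neural NER model, and `accept` is a hypothetical fixed-threshold filter standing in for the learned pseudo-label evaluation module.

```python
from collections import Counter, defaultdict


def train_tagger(labeled):
    """Toy stand-in for the NER model: count how often each token gets each tag."""
    counts = defaultdict(Counter)
    for tokens, tags in labeled:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return counts


def predict(model, tokens):
    """Return (tags, confidences). Unseen tokens get tag 'O' with confidence 0.0."""
    tags, confs = [], []
    for tok in tokens:
        if tok in model:
            tag, n = model[tok].most_common(1)[0]
            tags.append(tag)
            confs.append(n / sum(model[tok].values()))
        else:
            tags.append("O")
            confs.append(0.0)
    return tags, confs


def accept(confs, threshold=0.9):
    """Hypothetical selector: keep a sentence only if every token is confident.

    The paper learns this module; a fixed threshold is an assumption used here
    purely for illustration.
    """
    return bool(confs) and min(confs) >= threshold


def self_train(labeled, unlabeled, rounds=3):
    """Iteratively pseudo-label the pool, keep accepted sentences, and retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train_tagger(labeled)
    for _ in range(rounds):
        remaining, added = [], 0
        for tokens in pool:
            tags, confs = predict(model, tokens)
            if accept(confs):
                # Accepted pseudo labels are used as if they were true labels.
                labeled.append((tokens, tags))
                added += 1
            else:
                remaining.append(tokens)
        pool = remaining
        if added == 0:
            break  # nothing new was accepted, so retraining would change nothing
        model = train_tagger(labeled)
    return model, labeled, pool
```

In this toy setting a sentence containing any unseen token is always rejected, so the sketch only shows the control flow of generate, filter, and retrain; a real tagger and a learned selector would replace both stand-ins.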
Pages: 903-916
Page count: 13