Improving Distantly-Supervised Named Entity Recognition for Traditional Chinese Medicine Text via a Novel Back-Labeling Approach

Cited by: 17
Authors
Zhang, Dezheng [1, 2]
Xia, Chao [1, 2]
Xu, Cong [1, 2]
Jia, Qi [1, 2]
Yang, Shibing [1, 2]
Luo, Xiong [1, 2]
Xie, Yonghong [1, 2]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Dept Comp, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Beijing Key Lab Knowledge Engn Mat Sci, Beijing 100083, Peoples R China
Keywords
Task analysis; Vocabulary; Text recognition; Labeling; Tagging; Neural networks; Manuals; Back-labeling approach; distant supervision; named entity recognition; traditional Chinese medicine (TCM)
DOI
10.1109/ACCESS.2020.3015056
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent advances in deep neural networks (DNNs) have made it possible to build reliable named entity recognition (NER) models without handcrafted features. However, these machine learning methods also impose an obstacle: they need a large amount of manually labeled data. To avoid this limitation, human annotation can be replaced with distant supervision, but a technical challenge remains: entities that are not included in the vocabulary are ignored and therefore mislabeled, and this label error must be addressed to obtain an effective NER model. We therefore propose a novel back-labeling approach and integrate it into a tagging scheme, which we apply to the NER task in the traditional Chinese medicine (TCM) field. In addition, we discuss how to use distant supervision methods to achieve better NER performance. Our experiments verify that the scheme effectively improves entity recognition on top of distant supervision.
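Illustrative note: the label-error issue named in the abstract can be sketched with a generic dictionary-based distant labeler. This is not the paper's back-labeling approach, whose details the abstract does not give. The function name `distant_label`, the `HERB` entity type, the toy vocabulary, and the choice of longest-match BIO tagging over characters are all assumptions made for illustration.

```python
def distant_label(chars, vocab, max_len=4):
    """Tag characters with BIO labels by longest-match lookup in a vocabulary.

    Hypothetical helper for illustration only: `vocab` maps an entity
    string to an entity type, and anything not matched is tagged "O".
    """
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            span = "".join(chars[i:i + n])
            if span in vocab:
                etype = vocab[span]
                tags[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy vocabulary (an assumption): it contains one herb but not the other.
vocab = {"人参": "HERB"}
chars = list("人参和黄芪补气")
tags = distant_label(chars, vocab)

for ch, tag in zip(chars, tags):
    print(ch, tag)
```

In this toy run, 人参 is tagged as an entity because it is in the vocabulary, while 黄芪 is tagged `O` only because the vocabulary lacks it. A model trained on such labels is taught that a real entity is a non-entity, which is the kind of error the abstract says must be addressed.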
Pages: 145413-145421
Number of pages: 9