Word Embedding Bootstrapped Deep Active Learning Method to Information Extraction on Chinese Electronic Medical Record

Cited by: 4
Authors
Ma Q. [1]
Cen X. [1]
Yuan J. [1]
Hou X. [1]
Affiliations
[1] Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai
Keywords
Chinese electronic medical record (EMR); deep active learning; information extraction; named entity recognition (NER); word embedding
DOI
10.1007/s12204-021-2285-5
Abstract
Electronic medical records (EMRs) contain rich biomedical information and have great potential in disease diagnosis and biomedical research. However, EMR information usually takes the form of unstructured text, which raises the cost of use and hinders its application. This work presents an effective named entity recognition (NER) method for information extraction from Chinese EMRs, based on word embedding bootstrapped deep active learning, to promote the acquisition of medical information from Chinese EMRs and to release its value. Deep active learning with a bi-directional long short-term memory network followed by a conditional random field (Bi-LSTM+CRF) is used to capture the characteristics of different information from a labeled corpus, and the continuous bag-of-words and skip-gram word embedding models are combined in this model to capture the text features of Chinese EMRs from an unlabeled corpus. The method was evaluated on NER tasks over Chinese EMRs with “medical history” content. Experimental results show that the word embedding bootstrapped deep active learning method using an unlabeled medical corpus achieves better performance than the other models compared. © 2021, Shanghai Jiao Tong University and Springer-Verlag GmbH Germany, part of Springer Nature.
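As a rough illustration of two ideas named in the abstract, the sketch below shows one way to combine two embedding tables and one way to pick unlabeled sentences for annotation in an active learning round. The abstract does not say how the two embeddings are combined or which query strategy is used, so concatenation and least-confidence sampling are assumptions here, and the function names (`combine_embeddings`, `least_confidence`, `select_for_annotation`) are hypothetical. The per-token probabilities stand in for the output of a tagger such as Bi-LSTM+CRF; multiplying their maxima is a simplification and is not the CRF sequence probability.

```python
from typing import Dict, List, Sequence


def combine_embeddings(
    cbow: Dict[str, Sequence[float]],
    skipgram: Dict[str, Sequence[float]],
    token: str,
) -> List[float]:
    """Concatenate the two vectors of one token (assumed combination scheme)."""
    return list(cbow[token]) + list(skipgram[token])


def least_confidence(token_probs: Sequence[Sequence[float]]) -> float:
    """Uncertainty of one sentence: 1 minus the product of per-token max probabilities."""
    confidence = 1.0
    for probs in token_probs:
        confidence *= max(probs)
    return 1.0 - confidence


def select_for_annotation(
    pool: Dict[str, Sequence[Sequence[float]]], k: int
) -> List[str]:
    """Return the ids of the k most uncertain sentences in the unlabeled pool."""
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:k]
```

In a full loop under these assumptions, the selected sentences would be labeled by an annotator, added to the labeled corpus, and the tagger retrained before the next selection round.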
Pages: 494-502
Number of pages: 8