Named Entity Recognition in Chinese Electronic Medical Records Based on CRF

Cited by: 24
Authors
Liu, Kaixin [1]
Hu, Qingcheng [1]
Liu, Jianwei [1]
Xing, Chunxiao [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, Res Inst Informat Technol, Beijing 100084, Peoples R China
Source
2017 14TH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE (WISA 2017) | 2017
Keywords
electronic clinical texts; named entity recognition; CRF; EXTRACTION; ASSERTIONS;
DOI
10.1109/WISA.2017.8
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Massive Electronic Medical Records (EMRs) contain a great deal of knowledge, and Named Entity Recognition (NER) in Chinese EMRs is a very important task. However, owing to the lack of a Chinese medical dictionary, there are few studies on NER in Chinese EMRs. In this paper, we first build a medical dictionary. We then investigate the effects of different types of features, including bag-of-characters, part-of-speech, dictionary, and word clustering features, on Chinese clinical NER based on the Conditional Random Fields (CRF) algorithm, the most popular algorithm for NER. In the experiments, we randomly selected 220 clinical texts from Peking Anzhen Hospital. The results show that these features benefit Chinese named entity recognition to varying degrees. Finally, after analyzing the experimental results, we derive some rules of thumb.
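The four feature types named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the window size, the maximum dictionary term length, the per-character part-of-speech tags, and the character-level cluster lookup are all assumptions made for illustration. The output is a list of per-character feature dictionaries, a shape that common CRF toolkits generally accept.

```python
from typing import Dict, List, Set, Union

FeatureValue = Union[str, bool]


def in_dictionary(chars: List[str], i: int, dictionary: Set[str],
                  max_len: int = 4) -> bool:
    """Return True if a dictionary term of length <= max_len covers position i."""
    n = len(chars)
    for start in range(max(0, i - max_len + 1), i + 1):
        for end in range(i + 1, min(n, start + max_len) + 1):
            if "".join(chars[start:end]) in dictionary:
                return True
    return False


def char_features(chars: List[str], pos_tags: List[str], dictionary: Set[str],
                  clusters: Dict[str, str], i: int,
                  window: int = 1) -> Dict[str, FeatureValue]:
    """Build the feature dictionary for the character at position i."""
    feats: Dict[str, FeatureValue] = {
        "bias": True,
        # Part of speech (assumed to be inherited from the enclosing word).
        "pos": pos_tags[i],
        # Dictionary feature: is this character inside a known medical term?
        "in_dict": in_dictionary(chars, i, dictionary),
        # Cluster feature (hypothetical character-to-cluster lookup).
        "cluster": clusters.get(chars[i], "NONE"),
    }
    # Bag-of-characters: unordered characters within the context window.
    lo, hi = max(0, i - window), min(len(chars), i + window + 1)
    for c in chars[lo:hi]:
        feats["boc=" + c] = True
    return feats


def sentence_features(chars: List[str], pos_tags: List[str],
                      dictionary: Set[str],
                      clusters: Dict[str, str]) -> List[Dict[str, FeatureValue]]:
    """Feature dictionaries for every character in one sentence."""
    return [char_features(chars, pos_tags, dictionary, clusters, i)
            for i in range(len(chars))]


# Toy example with made-up tags, dictionary, and clusters.
example_chars = list("患者有高血压")
example_pos = ["n", "n", "v", "n", "n", "n"]
example_feats = sentence_features(example_chars, example_pos,
                                  {"高血压"}, {"高": "c1"})
```

Each feature group is kept as a separate key family so that a group can be switched on or off when comparing feature combinations, which is the kind of comparison the abstract describes.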
Pages: 105-110
Number of pages: 6
Related papers
18 in total
  • [11] A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries
    Jiang, Min
    Chen, Yukun
    Liu, Mei
    Rosenbloom, S. Trent
    Mani, Subramani
    Denny, Joshua C.
    Xu, Hua
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2011, 18 (05) : 601 - 606
  • [12] Kang N., 2010, P 2010 I2B2 VA WORKS
  • [13] A comprehensive study of named entity recognition in Chinese clinical text
    Lei, Jianbo
    Tang, Buzhou
    Lu, Xueqin
    Gao, Kaihua
    Jiang, Min
    Xu, Hua
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2014, 21 (05) : 808 - 814
  • [14] Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications
    Savova, Guergana K.
    Masanz, James J.
    Ogren, Philip V.
    Zheng, Jiaping
    Sohn, Sunghwan
    Kipper-Schuler, Karin C.
    Chute, Christopher G.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2010, 17 (05) : 507 - 513
  • [15] 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text
    Uzuner, Oezlem
    South, Brett R.
    Shen, Shuying
    DuVall, Scott L.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2011, 18 (05) : 552 - 556
  • [16] Electronic Medical Records (EMRs), Epidemiology, and Epistemology: Reflections on EMRs and Future Pediatric Clinical Research
    Wasserman, Richard C.
    [J]. ACADEMIC PEDIATRICS, 2011, 11 (04) : 280 - 287
  • [17] Joint segmentation and named entity recognition using dual decomposition in Chinese discharge summaries
    Xu, Yan
    Wang, Yining
    Liu, Tianren
    Liu, Jiahua
    Fan, Yubo
    Qian, Yi
    Tsujii, Junichi
    Chang, Eric I.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2014, 21 (E1) : E84 - E92
  • [18] Ye Feng, 2011, Chinese Journal of Biomedical Engineering, V30, P256, DOI 10.3969/j.issn.0258-8021.2011.02.014