Named Entity Recognition in Chinese Electronic Medical Records Based on CRF

Cited by: 24
Authors
Liu, Kaixin [1]
Hu, Qingcheng [1]
Liu, Jianwei [1]
Xing, Chunxiao [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, Res Inst Informat Technol, Beijing 100084, Peoples R China
Source
2017 14TH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE (WISA 2017) | 2017
Keywords
electronic clinical texts; named entity recognition; CRF; EXTRACTION; ASSERTIONS;
DOI
10.1109/WISA.2017.8
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Massive collections of Electronic Medical Records (EMRs) contain a wealth of knowledge, and Named Entity Recognition (NER) in Chinese EMRs is an important task. However, owing to the lack of a Chinese medical dictionary, there have been few studies on NER in Chinese EMRs. In this paper, we first build a medical dictionary. We then investigate the effects of different types of features, including bag-of-characters, part-of-speech, dictionary, and word clustering features, on Chinese clinical NER using the Conditional Random Fields (CRF) algorithm, the most popular algorithm for NER. For the experiments, we randomly selected 220 clinical texts from Peking Anzhen Hospital. The results show that these features benefit Chinese named entity recognition to varying degrees. Finally, from an analysis of the experimental results, we derive some rules of thumb.
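The feature types named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the forward-maximum-matching dictionary lookup, the one-character context window, and the way part-of-speech tags and cluster IDs are passed in as precomputed per-character lists are all assumptions made for illustration. The output is one feature dictionary per character, a shape that CRF toolkits commonly accept as input.

```python
from typing import Dict, List, Set


def dictionary_tags(chars: List[str], lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Tag each character as B (begins a lexicon entry), I (inside one), or O.

    Uses forward maximum matching; this matching strategy is an assumption,
    not something stated in the abstract.
    """
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = 0
        # Try the longest candidate span first.
        for length in range(min(max_len, len(chars) - i), 0, -1):
            if "".join(chars[i:i + length]) in lexicon:
                matched = length
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


def char_features(
    chars: List[str],
    pos_tags: List[str],      # hypothetical: POS tag of the word containing each character
    cluster_ids: List[str],   # hypothetical: word-cluster ID for each character's word
    lexicon: Set[str],
    window: int = 1,
) -> List[Dict[str, str]]:
    """Build one feature dict per character for a CRF sequence labeller."""
    dict_tags = dictionary_tags(chars, lexicon)
    features = []
    for i, ch in enumerate(chars):
        feats = {
            "char": ch,
            "pos": pos_tags[i],
            "dict": dict_tags[i],
            "cluster": cluster_ids[i],
        }
        # Neighbouring characters, padded at the sentence boundaries.
        for offset in range(1, window + 1):
            left, right = i - offset, i + offset
            feats[f"char-{offset}"] = chars[left] if left >= 0 else "<BOS>"
            feats[f"char+{offset}"] = chars[right] if right < len(chars) else "<EOS>"
        features.append(feats)
    return features
```

Calling `char_features(list("患者有高血压"), pos_tags, cluster_ids, {"高血压"})` with six-element tag lists yields six feature dictionaries, in which the last three characters carry the dictionary tags B, I, I.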
Pages: 105-110
Page count: 6
Related papers
18 in total
  • [1] Unsupervised entity and relation extraction from clinical records in Italian
    Alicante, Anita
    Corazza, Anna
    Isgro, Francesco
    Silvestri, Stefano
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2016, 72 : 263 - 275
  • [2] [Anonymous], 2016, ARXIV161108373
  • [3] [Anonymous], 2001, PROC 18 INT C MACH L
  • [4] The Unified Medical Language System (UMLS): integrating biomedical terminology
    Bodenreider, O
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D267 - D270
  • [5] Crammer K, 2006, J MACH LEARN RES, V7, P551
  • [6] Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010
    de Bruijn, Berry
    Cherry, Colin
    Kiritchenko, Svetlana
    Martin, Joel
    Zhu, Xiaodan
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2011, 18 (05) : 557 - 562
  • [7] Ferrucci D., 2004, Natural Language Engineering, V10, P327, DOI 10.1017/S1351324904003523
  • [8] Using Local Grammar for Entity Extraction from Clinical Reports
    Ghoulam, Aicha
    Barigou, Fatiha
    Belalem, Ghalem
    Meziane, Farid
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2015, 3 (03) : 16 - 24
  • [9] Jain D, 2015, CLEF WORKING NOTES
  • [10] Jiang J, 2015, CLEF WORKING NOTES