Section heading recognition in electronic health records using conditional random fields

Cited by: 0
Authors
Chen, Chih-Wei [1]
Chang, Nai-Wen [2, 3]
Chang, Yung-Chun [2]
Dai, Hong-Jie [1]
Affiliations
[1] Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University
[2] Institution of Information Science, Academia Sinica
[3] Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2014 / Vol. 8916
Keywords
Electronic health record; Information extraction; Natural language processing; Section recognition
DOI
10.1007/978-3-319-13987-6_5
Abstract
Electronic health records (EHRs) contain a wealth of information, such as discharge diagnoses, laboratory results, and pharmacy orders, which can be used to support clinical decision support systems and enable clinical and translational research. Unfortunately, the information is represented in a highly heterogeneous semi-structured or unstructured format with author- and domain-specific idiosyncrasies, acronyms, and abbreviations. To take full advantage of health data, researchers have applied text-mining techniques to recognize named entities (NEs) mentioned in EHRs. However, the clinical significance of the data cannot be judged at the NE level alone. For instance, a disease mentioned in the past medical history section has a different clinical significance from the same disease mentioned in the family medical history section. To obtain high-quality information and improve the understanding of clinical records, this work developed a machine learning-based section heading recognition system and evaluated its performance on a manually annotated corpus. The experimental results showed that the machine learning-based system achieved a satisfactory F-score of 0.939, outperforming a dictionary-based system by 0.321. © Springer International Publishing Switzerland 2014.
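As a hedged illustration of the evaluation figures in the abstract: assuming the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall), the stated margin of 0.321 implies a dictionary-based F-score of about 0.618. The Python sketch below shows that arithmetic; the function name `f_score` and the counts passed to it are hypothetical and are not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1) from true-positive, false-positive,
    and false-negative counts. Returns 0.0 when nothing is matched."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract.
ML_F = 0.939     # machine learning-based system
MARGIN = 0.321   # reported improvement over the dictionary-based system

# Implied dictionary-based F-score (derived here, not stated in the abstract).
dictionary_f = round(ML_F - MARGIN, 3)

# Made-up counts, for illustration only (not from the paper).
example = f_score(tp=90, fp=10, fn=10)
```

With these illustrative counts, precision and recall are both 0.9, so the balanced F-score is also 0.9.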
Pages: 47-55
Page count: 8