Automatic Segmentation of Clinical Texts

Cited by: 16
Authors
Apostolova, Emilia [1]
Channin, David S. [2]
Demner-Fushman, Dina [3]
Furst, Jacob [1]
Lytinen, Steven [1]
Raicu, Daniela [1]
Affiliations
[1] Depaul Univ, Coll Comp & Digital Media, Chicago, IL 60604 USA
[2] Northwestern Univ, Sch Med, Dept Radiol, Chicago, IL 60611 USA
[3] Natl Lib Med, Commun Engn Branch, Bethesda, MD 20894 USA
Source
2009 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-20 | 2009
DOI
10.1109/IEMBS.2009.5334831
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Clinical narratives, such as radiology and pathology reports, are commonly available in electronic form. However, they are also commonly entered and stored as free text. Knowledge of the structure of clinical narratives is necessary for enhancing the productivity of healthcare departments and facilitating research. This study attempts to automatically segment medical reports into semantic sections. Our goal is to develop a robust and scalable medical report segmentation system requiring minimum user input for efficient retrieval and extraction of information from free-text clinical narratives. Hand-crafted rules were used to automatically identify a high-confidence training set. This automatically created training dataset was later used to develop metrics and an algorithm that determines the semantic structure of the medical reports. A word-vector cosine similarity metric combined with several heuristics was used to classify each report sentence into one of several pre-defined semantic sections. This baseline algorithm achieved 79% accuracy. A Support Vector Machine (SVM) classifier trained on additional formatting and contextual features was able to achieve 90% accuracy. Plans for future work include developing a configurable system that could accommodate various medical report formatting and content standards.
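The baseline described in the abstract, a word-vector cosine similarity used to assign each sentence to a pre-defined section, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the section labels, the toy training sentences, the tokenizer, and the use of raw word counts are hypothetical, and the heuristics the authors combined with the metric are not reproduced because the abstract does not describe them.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_section_vectors(training):
    """Sum the word counts of all training sentences for each section label."""
    vectors = {}
    for label, sentence in training:
        vectors.setdefault(label, Counter()).update(bag_of_words(sentence))
    return vectors


def classify(sentence, section_vectors):
    """Return the section label whose word vector is most similar to the sentence."""
    vec = bag_of_words(sentence)
    return max(section_vectors, key=lambda label: cosine(vec, section_vectors[label]))


# Hypothetical toy training set; the paper instead derives a high-confidence
# training set automatically with hand-crafted rules.
TRAINING = [
    ("history", "Patient presents with chest pain and shortness of breath."),
    ("findings", "The lungs are clear with no pleural effusion."),
    ("impression", "No acute cardiopulmonary disease."),
]

SECTION_VECTORS = build_section_vectors(TRAINING)
predicted = classify("The lungs show no pleural effusion.", SECTION_VECTORS)
print(predicted)
```

In this sketch each section is represented by the summed word counts of its training sentences, and a new sentence receives the label of the most similar section vector. The SVM classifier mentioned in the abstract would replace this nearest-vector rule with a model trained on additional formatting and contextual features.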
Pages: 5905 / +
Page count: 2