Automatic abstraction of imaging observations with their characteristics from mammography reports

Cited by: 25
Authors
Bozkurt, Selen [1]
Lipson, Jafi A. [2]
Senol, Utku [3]
Rubin, Daniel L. [2, 4]
Affiliations
[1] Akdeniz Univ, Dept Biostat & Med Informat, Fac Med, Antalya, Turkey
[2] Stanford Univ, Dept Radiol, Stanford, CA 94305 USA
[3] Akdeniz Univ, Dept Radiol, Fac Med, Antalya, Turkey
[4] Stanford Univ, Dept Med Biomed Informat Res, Stanford, CA 94305 USA
Keywords
RADIOLOGISTS INTERPRETATIONS; COREFERENCE RESOLUTION; INFORMATION EXTRACTION; CLINICAL-DATA; SYSTEM; VARIABILITY; SUPPORT; RETRIEVAL; DIAGNOSIS
DOI
10.1136/amiajnl-2014-003009
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Background: Radiology reports are usually narrative, unstructured text, a format which hinders the ability to input report contents into decision support systems. In addition, reports often describe multiple lesions, and it is challenging to automatically extract information on each lesion and its relationships to characteristics, anatomic locations, and other information that describes it. The goal of our work is to develop natural language processing (NLP) methods to recognize each lesion in free-text mammography reports and to extract its corresponding relationships, producing a complete information frame for each lesion.
Materials and methods: We built an NLP information extraction pipeline in the General Architecture for Text Engineering (GATE) NLP toolkit. Sequential processing modules are executed, producing an output information frame required for a mammography decision support system. Each lesion described in the report is identified by linking it with its anatomic location in the breast. In order to evaluate our system, we selected 300 mammography reports from a hospital report database.
Results: The gold standard contained 797 lesions, and our system detected 815 lesions (780 true positives, 35 false positives, and 17 false negatives). The precision of detecting all the imaging observations with their modifiers was 94.9%, recall was 90.9%, and the F measure was 92.8%.
Conclusions: Our NLP system extracts each imaging observation and its characteristics from mammography reports. Although our application focuses on the domain of mammography, we believe our approach can generalize to other domains and may narrow the gap between unstructured clinical report text and structured information extraction needed for data mining and decision support.
Keywords: Breast Imaging Reporting and Data System (BI-RADS); information extraction; natural language processing; imaging informatics; breast
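The lesion-level counts in the Results can be turned into precision, recall, and F measure with the standard definitions. The sketch below assumes those textbook formulas (the abstract does not state which formulas were used), and the function name `precision_recall_f1` is hypothetical. Note that the reported 94.9 / 90.9 / 92.8 figures refer to imaging observations with their modifiers, a stricter task than lesion detection, so the values computed here are not expected to match them.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard precision, recall, and balanced F measure from raw counts."""
    precision = tp / (tp + fp)   # share of detected lesions that are correct
    recall = tp / (tp + fn)      # share of gold-standard lesions that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Lesion-level counts quoted in the abstract.
tp, fp, fn = 780, 35, 17

# Sanity checks against the totals given in the abstract.
assert tp + fp == 815   # lesions detected by the system
assert tp + fn == 797   # lesions in the gold standard

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
# prints: precision=0.957 recall=0.979 F1=0.968
```

The two assertions confirm that the quoted counts are internally consistent: true positives plus false positives give the 815 detected lesions, and true positives plus false negatives give the 797 gold-standard lesions.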
Pages: E81 / U246
Page count: 23