Natural Language Processing of Radiology Reports in Patients With Hepatocellular Carcinoma to Predict Radiology Resource Utilization

Cited by: 18
Authors
Brown, A. D. [1]
Kachura, J. R. [1]
Affiliations
[1] Univ Toronto, Univ Hlth Network, Toronto Gen Hosp, Div Vasc & Intervent Radiol,Dept Med Imaging, Toronto, ON, Canada
Keywords
Natural language processing; hepatocellular carcinoma; practice management; radiology reports
DOI
10.1016/j.jacr.2018.12.004
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Objective: Radiology is a finite health care resource in high demand at most health centers. However, anticipating fluctuations in demand is a challenge because of the inherent uncertainty in disease prognosis. The aim of this study was to explore the potential of natural language processing (NLP) to predict downstream radiology resource utilization in patients undergoing surveillance for hepatocellular carcinoma (HCC). Materials and Methods: All HCC surveillance CT examinations performed at our institution from January 1, 2010, to October 31, 2017, were selected from our departmental radiology information system. We used open-source NLP and machine learning software to parse radiology report text into bag-of-words and term frequency-inverse document frequency (TF-IDF) representations. Three machine learning models (logistic regression, support vector machine [SVM], and random forest) were used to predict future utilization of radiology department resources. A test data set was used to calculate accuracy, sensitivity, and specificity in addition to the area under the curve (AUC). Results: As a group, the bag-of-words models were slightly inferior to the TF-IDF feature extraction approach. The TF-IDF + SVM model outperformed all other models with an accuracy of 92%, a sensitivity of 83%, and a specificity of 96%, with an AUC of 0.971. Conclusions: NLP-based models can accurately predict downstream radiology resource utilization from narrative HCC surveillance reports and have potential for translation to health care management, where they may improve decision making, reduce costs, and broaden access to care.
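
The abstract names two text representations (bag-of-words and TF-IDF) and three evaluation metrics (accuracy, sensitivity, specificity) without showing how they are computed. The following is a minimal illustrative sketch in plain Python, not the authors' implementation: the tokenizer, the textbook weighting tf × log(N / df), and the helper names `tokenize`, `bag_of_words`, `tf_idf`, and `binary_metrics` are all assumptions made for illustration. The abstract does not specify which software or TF-IDF variant was used.

```python
import math
from collections import Counter

PUNCT = ".,;:()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation (assumed scheme)."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def bag_of_words(docs):
    """Return one term-count Counter per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Weight each term by tf * log(N / df), a textbook TF-IDF variant.

    tf is the term count divided by the document length; df is the number
    of documents containing the term. Terms present in every document get
    a weight of zero.
    """
    counts = bag_of_words(docs)
    n_docs = len(docs)
    df = Counter(term for c in counts for term in c)  # document frequency
    weighted = []
    for c in counts:
        length = sum(c.values())
        weighted.append(
            {term: (n / length) * math.log(n_docs / df[term]) for term, n in c.items()}
        )
    return weighted


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Invented toy report snippets, not data from the study.
    reports = [
        "No suspicious liver lesion. Continue routine surveillance.",
        "New arterially enhancing lesion with washout. Recommend MRI.",
        "Stable cirrhosis. No suspicious lesion.",
    ]
    for weights in tf_idf(reports):
        print(sorted(weights.items(), key=lambda kv: -kv[1])[:3])
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1]))
```

In this toy example the word "lesion" appears in every report, so its TF-IDF weight is zero, while a term such as "mri" that appears in only one report receives a positive weight. Down-weighting ubiquitous terms in this way is the general motivation for TF-IDF over raw counts; whether that explains the result reported in the abstract is not stated there.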
Pages: 840-844
Page count: 5