Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification

Authors
Robert Lou
Darco Lalevic
Charles Chambers
Hanna M. Zafar
Tessa S. Cook
Affiliations
[1] Perelman School of Medicine at the University of Pennsylvania
[2] Hospital of the University of Pennsylvania
Source
Journal of Digital Imaging | 2020 / Vol. 33
Keywords
Artificial intelligence; Binary classification; Follow-up; Machine learning; Natural language processing; Structured reporting
Abstract
While radiologists regularly issue follow-up recommendations, our preliminary research has shown that 35 to 50% of patients who receive follow-up recommendations for findings of possible cancer on abdominopelvic imaging do not return for follow-up. As such, they remain at risk for adverse outcomes related to missed or delayed cancer diagnosis. In this study, we develop an algorithm that uses natural language processing (NLP) techniques and machine learning models to automatically detect free-text radiology reports that contain a follow-up recommendation. The data set used in this study consists of 6000 free-text reports from the authors' institution. NLP techniques are used to engineer 1500 features, which include the most informative unigrams, bigrams, and trigrams in the training corpus after tokenization and Porter stemming. On this data set, we train naive Bayes, decision tree, and maximum entropy models. The decision tree model, with an F1 score of 0.458 and accuracy of 0.862, outperforms both the naive Bayes (F1 score of 0.381) and maximum entropy (F1 score of 0.387) models. The models were analyzed to determine predictive features, with the term frequency of n-grams such as "renal neoplasm" and the stemmed trigram "evalu with enhanc" being most predictive of a follow-up recommendation. Key to maximizing performance were feature engineering that extracts predictive information and selection of machine learning algorithms appropriate to the feature set.
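The feature-engineering and evaluation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, Porter stemming is omitted to keep the sketch dependency-free, and raw corpus frequency stands in for the unspecified "most informative" selection criterion.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, max_n=3):
    """Return all unigrams, bigrams, and trigrams as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def top_ngram_features(reports, k):
    """Assumption: pick the k most frequent n-grams as a stand-in for 'most informative'."""
    counts = Counter()
    for text in reports:
        counts.update(ngrams(tokenize(text)))
    return [gram for gram, _ in counts.most_common(k)]


def term_frequencies(text, features):
    """Represent one report as the term frequency of each selected n-gram."""
    counts = Counter(ngrams(tokenize(text)))
    return [counts[f] for f in features]


def f1_and_accuracy(y_true, y_pred):
    """Compute F1 (for the positive class, label 1) and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return f1, accuracy
```

Because F1 ignores true negatives, a classifier can score high accuracy and a low F1 when follow-up reports are a minority class. For example, `f1_and_accuracy([1, 0, 0, 0, 0, 0, 0, 0, 0, 1], [0] * 10)` yields an F1 of 0.0 alongside an accuracy of 0.8, which is one plausible reading of the gap between the reported F1 scores and the 0.862 accuracy.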
Pages: 131-136