Natural Language Processing in Dutch Free Text Radiology Reports: Challenges in a Small Language Area Staging Pulmonary Oncology

Cited by: 18
Authors
Nobel, J. Martijn [1, 2]
Puts, Sander [3 ]
Bakers, Frans C. H. [1 ]
Robben, Simon G. F. [1, 2]
Dekker, Andre L. A. J. [3 ]
Affiliations
[1] Maastricht Univ, Med Ctr, Dept Radiol & Nucl Med, Postbox 5800, NL-6202 AZ Maastricht, Netherlands
[2] Maastricht Univ, Sch Hlth Profess Educ, Maastricht, Netherlands
[3] Maastricht Univ, Med Ctr, GROW Sch Oncol & Dev Biol, Dept Radiat Oncol MAASTRO, Maastricht, Netherlands
Keywords
Radiology; Reporting; Natural language processing; Free text; Classification system; Machine learning; CLASSIFICATION;
DOI
10.1007/s10278-020-00327-z
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Reports are the standard means of communication between the radiologist and the referring clinician. Efforts are being made to improve this communication by, for instance, introducing standardization and structured reporting. Natural Language Processing (NLP) is another promising tool that can improve and enhance the radiological report by processing free text. NLP adds structure to the report and exposes its information, which in turn can be used for further analysis. This paper describes pre-processing and processing steps and highlights important challenges that must be overcome to successfully implement a free text mining algorithm using NLP tools and machine learning in a small language area, such as Dutch. A rule-based algorithm was constructed to classify the T-stage of pulmonary oncology from the original free text radiological report, based on three items: tumor size, presence, and involvement, according to the 8th TNM classification system. PyContextNLP, spaCy, and regular expressions were used as tools to extract the correct information and process the free text. Overall accuracy of the algorithm for evaluating T-stage was 0.83 in the training set and 0.87 in the validation set, which shows that the approach in this pilot study is promising. Future research with larger datasets and external validation is needed to be able to introduce more machine learning approaches and perhaps to reduce the required input of domain-specific knowledge. However, a hybrid NLP approach will probably achieve the best results.
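The size-based part of such a rule-based classifier can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses only Python's standard `re` module rather than PyContextNLP or spaCy, it ignores the presence and involvement items, and it naively takes the largest measurement in the text as the tumor size. The names `SIZE_RE`, `largest_size_cm`, and `t_stage_from_size` and the Dutch example sentences are hypothetical. The size cut-offs are those of the 8th TNM edition for lung cancer.

```python
import re
from typing import Optional

# Hypothetical pattern: a number with a Dutch decimal comma (or a point),
# followed by a unit of mm or cm.
SIZE_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def largest_size_cm(text: str) -> Optional[float]:
    """Return the largest measurement found in the text, in cm, or None."""
    sizes = []
    for value, unit in SIZE_RE.findall(text):
        number = float(value.replace(",", "."))  # "4,2" -> 4.2
        sizes.append(number / 10 if unit.lower() == "mm" else number)
    return max(sizes) if sizes else None


def t_stage_from_size(size_cm: float) -> str:
    """Map tumor size to a T-stage using the 8th TNM size cut-offs only.

    Simplification: invasion of adjacent structures and separate tumor
    nodules, which can raise the stage, are not modeled here.
    """
    if size_cm <= 1:
        return "T1a"
    if size_cm <= 2:
        return "T1b"
    if size_cm <= 3:
        return "T1c"
    if size_cm <= 4:
        return "T2a"
    if size_cm <= 5:
        return "T2b"
    if size_cm <= 7:
        return "T3"
    return "T4"


# Hypothetical report sentence ("Tumor in the right upper lobe of 4.2 cm.")
report = "Tumor in de rechter bovenkwab van 4,2 cm."
size = largest_size_cm(report)
if size is not None:
    print(size, t_stage_from_size(size))
```

A real pipeline would also need to tell tumor measurements apart from other measurements in the report (lymph nodes, for example) and to handle negated or uncertain findings, which is where context-aware tools come in.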
Pages: 1002-1008
Page count: 7