Natural Language Processing in Dutch Free Text Radiology Reports: Challenges in a Small Language Area Staging Pulmonary Oncology

Cited by: 17
Authors
Nobel, J. Martijn [1 ,2 ]
Puts, Sander [3 ]
Bakers, Frans C. H. [1 ]
Robben, Simon G. F. [1 ,2 ]
Dekker, Andre L. A. J. [3 ]
Affiliations
[1] Maastricht Univ, Med Ctr, Dept Radiol & Nucl Med, Postbox 5800, NL-6202 AZ Maastricht, Netherlands
[2] Maastricht Univ, Sch Hlth Profess Educ, Maastricht, Netherlands
[3] Maastricht Univ, Med Ctr, GROW Sch Oncol & Dev Biol, Dept Radiat Oncol MAASTRO, Maastricht, Netherlands
Keywords
Radiology; Reporting; Natural language processing; Free text; Classification system; Machine learning; CLASSIFICATION;
DOI
10.1007/s10278-020-00327-z
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002 ; 100207 ; 1009 ;
Abstract
Reports are the standard means of communication between the radiologist and the referring clinician. Efforts are being made to improve this communication, for instance by introducing standardization and structured reporting. Natural Language Processing (NLP) is another promising tool that can improve and enhance the radiological report by processing free text. NLP adds structure to the report and exposes its information, which in turn can be used for further analysis. This paper describes the pre-processing and processing steps and highlights the main challenges that must be overcome to successfully implement a free text mining algorithm using NLP tools and machine learning in a small language area such as Dutch. A rule-based algorithm was constructed to classify the T-stage of pulmonary oncology from the original free text radiological report, based on the items tumor size, presence and involvement according to the 8th TNM classification system. PyContextNLP, spaCy and regular expressions were used as tools to extract the correct information and process the free text. Overall accuracy of the algorithm for evaluating T-stage was 0.83 in the training set and 0.87 in the validation set, which shows that the approach in this pilot study is promising. Future research with larger datasets and external validation is needed before more machine learning approaches can be introduced and, perhaps, before the required input of domain-specific knowledge can be reduced. However, a hybrid NLP approach will probably achieve the best results.
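To make the rule-based idea concrete, the following is a minimal sketch of the size-only part of such a classifier, written with Python regular expressions. It is not the authors' algorithm: the pattern `SIZE_PATTERN`, the function names, and the choice to stage on the largest measurement found are all assumptions made for illustration, and the sketch ignores tumor presence, involvement, negation and context, which the paper's approach also takes into account. The size cut-offs follow the 8th TNM edition for lung cancer.

```python
import re
from typing import List, Optional

# Hypothetical pattern: a number with a Dutch decimal comma or a point,
# followed by a unit in mm or cm (e.g. "4,5 cm", "35mm").
SIZE_PATTERN = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def extract_sizes_cm(report: str) -> List[float]:
    """Return every measurement found in the report, converted to centimetres."""
    sizes = []
    for value, unit in SIZE_PATTERN.findall(report):
        number = float(value.replace(",", "."))
        sizes.append(number / 10 if unit.lower() == "mm" else number)
    return sizes


def t_stage_from_size(size_cm: float) -> str:
    """Map a tumor diameter to the size-based T category (TNM 8th edition, lung)."""
    if size_cm <= 1:
        return "T1a"
    if size_cm <= 2:
        return "T1b"
    if size_cm <= 3:
        return "T1c"
    if size_cm <= 4:
        return "T2a"
    if size_cm <= 5:
        return "T2b"
    if size_cm <= 7:
        return "T3"
    return "T4"


def size_based_t_stage(report: str) -> Optional[str]:
    """Assumption: stage on the largest measurement; None if no size is reported."""
    sizes = extract_sizes_cm(report)
    return t_stage_from_size(max(sizes)) if sizes else None


if __name__ == "__main__":
    print(size_based_t_stage("Massa van 4,5 cm in de rechter bovenkwab."))
```

Staging on the largest measurement is a deliberate simplification: in real reports a measurement may belong to a lymph node or a prior examination, which is one reason context-aware tooling is needed on top of plain pattern matching.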
Pages: 1002-1008
Number of pages: 7