Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning

Cited by: 7
Authors
Park, Hyung Jun [1, 7]
Park, Namu [2 ]
Lee, Jang Ho [1 ]
Choi, Myeong Geun [3 ]
Ryu, Jin-Sook [4 ]
Song, Min [5 ]
Choi, Chang-Min [1, 6]
Affiliations
[1] Univ Ulsan, Coll Med, Asan Med Ctr, Dept Pulm & Crit Care Med, 88, Olymp Ro 43 Gil, Seoul 05505, South Korea
[2] Univ Washington, Sch Med, Dept Biomed Informat & Med Educ, Seattle, WA USA
[3] Ewha Womans Univ, Mokdong Hosp, Coll Med, Div Pulm & Crit Care Med, Dept Internal Med, Seoul, South Korea
[4] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Nucl Med, Seoul, South Korea
[5] Yonsei Univ, Dept Digital Analyt, 50 Yonsei Ro, Seoul 03722, South Korea
[6] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Oncol, Seoul, South Korea
[7] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Informat Med, Seoul, South Korea
Keywords
Natural language processing; Auto-annotation; Deep learning; Lung cancer; Pseudo-labelling
DOI
10.1186/s12911-022-01975-7
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Extracting metastatic information from existing radiology text reports is important, but laborious annotation has limited the usability of these texts. We developed a deep-learning model that extracts the primary lung cancer site, metastatic lymph nodes, and distant metastases from PET-CT reports in order to determine lung cancer stage.
Methods: PET-CT reports, written entirely in English, were acquired from two cohorts of patients with lung cancer who were diagnosed at a tertiary hospital between January 2004 and March 2020. One cohort of 20,466 PET-CT reports was used for the training and validation sets, and the other cohort of 4,190 PET-CT reports was used as an additional test set. A pre-processing model (Lung Cancer Spell Checker) was applied to correct typographical errors, and pseudo-labelling was used to train the model. The deep-learning model was built as a convolutional-recurrent neural network. The performance metrics for the prediction model were accuracy, precision, sensitivity, micro-AUROC, and AUPRC.
Results: For extraction of the primary lung cancer location, the model showed a micro-AUROC of 0.913 in the validation set and 0.946 in the additional test set. For metastatic lymph nodes, the model showed a sensitivity of 0.827 and a specificity of 0.960. For distant metastasis, the model showed a micro-AUROC of 0.944 in the validation set and 0.950 in the additional test set.
Conclusion: Our deep-learning method could be used to extract lung cancer stage information from PET-CT reports and may facilitate lung cancer studies by relieving clinicians of laborious annotation.
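The metrics named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' evaluation code: the function names `sensitivity_specificity` and `micro_auroc` are hypothetical, the inputs are toy values, and micro-averaging is assumed here to mean pooling every (report, class) pair before computing a single AUROC.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Assumes at least one positive and one negative case in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def micro_auroc(y_true: Sequence[Sequence[int]], y_score: Sequence[Sequence[float]]) -> float:
    """Micro-averaged AUROC over a multi-class (one-hot or multi-label) problem.

    y_true:  one row per report, one 0/1 indicator per class
    y_score: predicted scores with the same shape
    """
    # Pool every (score, label) pair across all reports and classes.
    pairs = [(s, t) for row_t, row_s in zip(y_true, y_score) for t, s in zip(row_t, row_s)]
    pos = [s for s, t in pairs if t == 1]
    neg = [s for s, t in pairs if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    # AUROC is the probability that a random positive outranks a random
    # negative, with ties counted as one half (Mann-Whitney formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three reports, three hypothetical location classes.
toy_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
toy_score = [[0.9, 0.1, 0.2], [0.3, 0.8, 0.4], [0.2, 0.7, 0.6]]
print(micro_auroc(toy_true, toy_score))
print(sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

The pairwise comparison is quadratic in the number of pooled pairs, which is fine for a sketch; a rank-based computation or an established library routine would be a more practical choice at the scale of the cohorts described.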
Pages: 11