Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning

Cited by: 7

Authors:

Park, Hyung Jun ^{[1,7]}

Park, Namu ^{[2]}

Lee, Jang Ho ^{[1]}

Choi, Myeong Geun ^{[3]}

Ryu, Jin-Sook ^{[4]}

Song, Min ^{[5]}

Choi, Chang-Min ^{[1,6]}

Affiliations:

[1] Univ Ulsan, Coll Med, Asan Med Ctr, Dept Pulm & Crit Care Med, 88, Olymp Ro 43 Gil, Seoul 05505, South Korea

[2] Univ Washington, Sch Med, Dept Biomed Informat & Med Educ, Seattle, WA USA

[3] Ewha Womans Univ, Mokdong Hosp, Coll Med, Div Pulm & Crit Care Med, Dept Internal Med, Seoul, South Korea

[4] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Nucl Med, Seoul, South Korea

[5] Yonsei Univ, Dept Digital Analyt, 50 Yonsei Ro, Seoul 03722, South Korea

[6] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Oncol, Seoul, South Korea

[7] Univ Ulsan, Asan Med Ctr, Coll Med, Dept Informat Med, Seoul, South Korea

Source:

BMC MEDICAL INFORMATICS AND DECISION MAKING | 2022 / Volume 22 / Issue 01

Keywords:

Natural language processing; Auto-annotation; Deep learning; Lung cancer; Pseudo-labelling;

DOI:

10.1186/s12911-022-01975-7

Chinese Library Classification number:

R-058

Subject classification code:

Abstract:

Background: Extracting metastatic information from previous radiologic-text reports is important; however, laborious annotation has limited the usability of these texts. We developed a deep-learning model for extracting primary lung cancer sites, metastatic lymph nodes, and distant metastasis information from PET-CT reports to determine lung cancer stages.

Methods: PET-CT reports, fully written in English, were acquired from two cohorts of patients with lung cancer who were diagnosed at a tertiary hospital between January 2004 and March 2020. One cohort of 20,466 PET-CT reports was used for the training and validation sets, and the other cohort of 4,190 PET-CT reports was used as an additional test set. A pre-processing model (Lung Cancer Spell Checker) was applied to correct typographical errors, and pseudo-labelling was used for training the model. The deep-learning model was constructed using a convolutional-recurrent neural network. The performance metrics for the prediction model were accuracy, precision, sensitivity, micro-AUROC, and AUPRC.

Results: For the extraction of primary lung cancer location, the model showed a micro-AUROC of 0.913 and 0.946 in the validation set and the additional test set, respectively. For metastatic lymph nodes, the model showed a sensitivity of 0.827 and a specificity of 0.960. In predicting distant metastasis, the model showed a micro-AUROC of 0.944 and 0.950 in the validation set and the additional test set, respectively.

Conclusion: Our deep-learning method could be used for extracting lung cancer stage information from PET-CT reports and may facilitate lung cancer studies by alleviating laborious annotation by clinicians.


Pages: 11
