Detection of Pneumothorax with Deep Learning Models: Learning From Radiologist Labels vs Natural Language Processing Model Generated Labels

Cited by: 9
Authors
Hallinan, James Thomas Patrick Decourcy [1]
Feng, Mengling [2]
Ng, Dianwen [1, 2]
Sia, Soon Yiew [1]
Tiong, Vincent Tze Yang [1]
Jagmohan, Pooja [1]
Makmur, Andrew [1]
Thian, Yee Liang [1]
Affiliations
[1] Natl Univ Singapore Hosp, Dept Diagnost Imaging, Singapore, Singapore
[2] Natl Univ Singapore, Natl Univ Hlth Syst, Inst Data Sci, Saw Swee Hock Sch Publ Hlth,Yong Loo Lin Sch Med, Singapore, Singapore
Funding
UK Medical Research Council
Keywords
Pneumothorax; Radiography, Thoracic; Neural Networks, Computer; Deep learning; Natural language processing (NLP)
DOI
10.1016/j.acra.2021.09.013
Chinese Library Classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Rationale and Objectives: To compare the performance of deep learning models for pneumothorax detection trained with radiologist labels versus natural language processing (NLP) labels on the NIH ChestX-ray14 dataset.
Materials and Methods: The ChestX-ray14 dataset consists of 112,120 frontal chest radiographs, with 5,302 positive and 106,818 negative labels for pneumothorax generated by NLP (dataset A). All 112,120 radiographs were also inspected by 4 radiologists, leaving a visually confirmed set of 5,138 positive and 104,751 negative for pneumothorax (dataset B). Datasets A and B were used independently to train 3 convolutional neural network (CNN) architectures (ResNet-50, DenseNet-121, and EfficientNetB3). The area under the receiver operating characteristic curve (AUC) of each model was evaluated on the official NIH test set and on an external test set of 525 chest radiographs from our emergency department.
Results: On the NIH internal test set, CNN models trained with radiologist labels had significantly higher AUCs than those trained with NLP labels across all architectures. AUCs for the NLP/radiologist-label models were 0.838 (95% CI: 0.830, 0.846)/0.881 (95% CI: 0.873, 0.887) for ResNet-50 (p = 0.034), 0.839 (95% CI: 0.831, 0.847)/0.880 (95% CI: 0.873, 0.887) for DenseNet-121, and 0.869 (95% CI: 0.863, 0.876)/0.943 (95% CI: 0.939, 0.946) for EfficientNetB3 (p ≤ 0.001). Evaluation on the external test set also showed higher AUCs (p < 0.001) for the CNN models trained with radiologist labels than for those trained with NLP labels across all architectures. The AUCs for the NLP/radiologist-label models were 0.686 (95% CI: 0.632, 0.740)/0.806 (95% CI: 0.758, 0.854) for ResNet-50, 0.736 (95% CI: 0.686, 0.787)/0.871 (95% CI: 0.830, 0.912) for DenseNet-121, and 0.822 (95% CI: 0.775, 0.868)/0.915 (95% CI: 0.882, 0.948) for EfficientNetB3.
Conclusion: Pneumothorax detection deep learning models trained with radiologist labels showed better performance and generalizability than models trained with NLP labels.
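The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were obtained. As a rough illustration only, the sketch below computes an AUC from labels and scores and attaches a percentile bootstrap interval. The bootstrap is a stand-in chosen for simplicity, not the authors' stated method, and the function names (`auc`, `bootstrap_auc_ci`) and the toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    auc(labels, scores)  # fails early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement; skip single-class resamples.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = pneumothorax, 0 = no pneumothorax (made-up scores).
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_scores = [0.10, 0.25, 0.30, 0.55, 0.40, 0.35, 0.60, 0.70, 0.80, 0.90]
point = auc(toy_labels, toy_scores)
low, high = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
print(f"AUC = {point:.3f} (95% CI: {low:.3f}, {high:.3f})")
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example but slow for a test set the size of ChestX-ray14; a rank-based AUC would be the usual choice there. Comparing two models' AUCs on the same test set needs a paired test, which this sketch does not cover.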
Pages: 1350-1358
Number of pages: 9