Deep learning for automated classification of tuberculosis-related chest X-Ray: dataset distribution shift limits diagnostic performance generalizability

Cited by: 54

Authors:

Sathitratanacheewin, Seelwan ^{[1,2]}

Sunanta, Panasun ^{[2,3]}

Pongpirul, Krit ^{[2,4,5,6,7]}

Affiliations:

[1] Chulalongkorn Univ, Fac Med, Dept Med, Bangkok, Thailand

[2] Thai Hlth AI Fdn, Bangkok, Thailand

[3] True Digital Grp Co Ltd, Bangkok, Thailand

[4] Chulalongkorn Univ, Fac Med, Dept Prevent & Med, Bangkok, Thailand

[5] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Int Hlth, Baltimore, MD 21205 USA

[6] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Hlth Behav & Soc, Baltimore, MD 21205 USA

[7] Bumrungrad Int Hosp, Bangkok, Thailand

Source:

HELIYON | 2020 / Volume 6 / Issue 08

Keywords:

Computer science; Applied computing in medical science; Tuberculosis; Deep learning; Chest X-Ray;

DOI:

10.1016/j.heliyon.2020.e04614

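Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Subject classification codes: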

07 ; 0710 ; 09 ;

Abstract:

Background: Machine learning is an emerging tool for many aspects of infectious disease work, including tuberculosis surveillance and detection. However, the World Health Organization (WHO) has provided no recommendation on using computer-aided tuberculosis detection software because of the small number of studies, methodological limitations, and limited generalizability of the findings. Methods: To quantify the generalizability of a machine-learning model, we developed a Deep Convolutional Neural Network (DCNN) model using a tuberculosis (TB)-specific chest X-ray (CXR) dataset from one population (National Library of Medicine Shenzhen No.3 Hospital) and tested it with a non-TB-specific CXR dataset from another population (National Institutes of Health Clinical Center). Results: In the training and intramural test sets from the Shenzhen hospital database, the DCNN model achieved AUCs of 0.9845 and 0.8502 for detecting TB, respectively. However, the AUC of the supervised DCNN model on the ChestX-ray8 dataset dropped dramatically to 0.7054. Using a cut point of 0.90, which corresponded to 72% sensitivity and 82% specificity in the Shenzhen dataset, the final DCNN model estimated that 36.51% of abnormal radiographs in the ChestX-ray8 dataset were related to TB. Conclusion: A supervised deep learning model developed using a training dataset from one population may not have the same diagnostic performance in another population. Technical specifications of CXR images, disease severity distribution, dataset distribution shift, and overdiagnosis should be examined before implementation in other settings.
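The abstract reports its results as AUC and as sensitivity and specificity at a fixed cut point. As a minimal sketch of how those quantities are defined, the following computes them from model scores and true labels. The function names and the eight toy scores are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case scores higher.
    Ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, cut_point):
    """Call a case positive when its score is >= cut_point, then return
    (sensitivity, specificity) against the true labels."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cut_point)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cut_point)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cut_point)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cut_point)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = TB, 0 = non-TB; scores are model outputs in [0, 1].
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.92, 0.91, 0.60, 0.93, 0.40, 0.30, 0.10]

print("AUC:", auc(labels, scores))
print("Sensitivity, specificity at 0.90:",
      sensitivity_specificity(labels, scores, 0.90))
```

AUC does not depend on any cut point, whereas sensitivity and specificity do, so a threshold chosen on one dataset can behave quite differently on another whose score distribution has shifted.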

References

Pages: 4

10 entries in total

[1] [Anonymous], 2013, INSIGHTS IMAGING S1

[2] [Anonymous], 2017, ARXIV171010501CS

[3] When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs [J].

Cheng, Gong; Yang, Ceyuan; Yao, Xiwen; Guo, Lei; Han, Junwei.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2018, 56 (05) :2811-2821

[4] The Sensitivity and Specificity of Using a Computer Aided Diagnosis Program for Automatically Scoring Chest X-Rays of Presumptive TB Patients Compared with Xpert MTB/RIF in Lusaka Zambia [J].

Muyoyeta, Monde; Maduskar, Pragnya; Moyo, Maureen; Kasese, Nkatya; Milimo, Deborah; Spooner, Rosanna; Kapata, Nathan; Hogeweg, Laurens; van Ginneken, Bram; Ayles, Helen.

PLOS ONE, 2014, 9 (04)

[5] Rajpurkar P, 2017, arXiv, DOI arXiv:1711.05225

[6] Screening for pulmonary tuberculosis in a Tanzanian prison and computer-aided interpretation of chest X-rays [J].

Steiner, A.; Mangu, C.; van den Hombergh, J.; van Deutekom, H.; van Ginneken, B.; Clowes, P.; Mhimbira, F.; Mfinanga, S.; Rachow, A.; Reither, K.; Hoelscher, M.

PUBLIC HEALTH ACTION, 2015, 5 (04) :249-254

[7] Summers R.M., 2017, PROC CVPR IEEE, P2097

[8] Machine Learning for Healthcare: On the Verge of a Major Shift in Healthcare Epidemiology [J].

Wiens, Jenna; Shenoy, Erica S.

CLINICAL INFECTIOUS DISEASES, 2018, 66 (01) :149-153

[9] World Health Organization, 2016, Technical report, report no. WHO/HTM/TB/2016.20

[10] Learning Compact and Discriminative Stacked Autoencoder for Hyperspectral Image Classification [J].

Zhou, Peicheng; Han, Junwei; Cheng, Gong; Zhang, Baochang.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (07) :4823-4833
