A systematic review of the diagnostic accuracy of artificial intelligence-based computer programs to analyze chest x-rays for pulmonary tuberculosis

Cited by: 100
Authors
Harris, Miriam [1 ,2 ,3 ]
Qi, Amy [2 ,4 ,5 ]
Jeagal, Luke [4 ,5 ]
Torabi, Nazi [6 ]
Menzies, Dick [1 ,4 ,5 ,7 ]
Korobitsyn, Alexei [8 ]
Pai, Madhukar [1 ,4 ,5 ,7 ]
Nathavitharana, Ruvandhi R. [9 ]
Khan, Faiz Ahmad [1 ,4 ,5 ,7 ]
Affiliations
[1] McGill Univ, Dept Epidemiol & Biostat, Montreal, PQ, Canada
[2] McGill Univ, Hlth Ctr, Dept Med, Montreal, PQ, Canada
[3] Boston Univ, Boston Med Ctr, Dept Med, Boston, MA 02215 USA
[4] McGill Univ, Hlth Ctr, Montreal Chest Inst, Resp Epidemiol & Clin Res Unit, Montreal, PQ, Canada
[5] McGill Univ, Hlth Ctr, Res Inst, Montreal, PQ, Canada
[6] St Michaels Hosp, Li Ka Shing Int Healthcare Educ Ctr, Toronto, ON, Canada
[7] McGill Int TB Ctr, Montreal, PQ, Canada
[8] WHO, Labs Diagnost & Drug Resistance Global TB Program, Geneva, Switzerland
[9] Beth Israel Deaconess Med Ctr, Div Infect Dis, Boston, MA 02215 USA
Source
PLOS ONE | 2019 / Vol. 14 / Issue 09
Keywords
AIDED DETECTION; AUTOMATIC DETECTION; RADIOGRAPHY; QUANTIFICATION; CLASSIFICATION; COMBINATION; SETTINGS; MODELS
DOI
10.1371/journal.pone.0221339
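Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09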
Abstract
We undertook a systematic review of the diagnostic accuracy of artificial intelligence-based software for identification of radiologic abnormalities (computer-aided detection, or CAD) compatible with pulmonary tuberculosis on chest x-rays (CXRs). We searched four databases for articles published between January 2005 and February 2019. We summarized data on CAD type, study design, and diagnostic accuracy. We assessed risk of bias with QUADAS-2. We included 53 of the 4712 articles reviewed: 40 focused on CAD design methods ("Development" studies) and 13 focused on evaluation of CAD ("Clinical" studies). Meta-analyses were not performed due to methodological differences. Development studies were more likely than Clinical studies to use CXR databases with greater potential for bias. Areas under the receiver operating characteristic curve (median AUC [IQR]) were significantly higher: in Development studies (0.88 [0.82-0.90]) versus Clinical studies (0.75 [0.66-0.87]; p = 0.004); and with deep-learning (0.91 [0.88-0.99]) versus machine-learning (0.82 [0.75-0.89]; p = 0.001). We conclude that CAD programs are promising, but the majority of work thus far has been on development rather than clinical evaluation. We provide concrete suggestions on what study design elements should be improved.
Pages: 19