Diagnostic test accuracy of externally validated convolutional neural network (CNN) artificial intelligence (AI) models for emergency head CT scans - A systematic review

Cited by: 2
Authors
Maenpaa, Saana M. [1 ,2 ]
Korja, Miikka [1 ]
Affiliations
[1] Univ Helsinki, Dept Neurosurg, POB 320,Haartmaninkatu 4, Helsinki 00029, Finland
[2] Helsinki Univ Hosp, POB 320,Haartmaninkatu 4, Helsinki 00029, Finland
Keywords
Artificial intelligence; Deep learning; Convolutional neural network; Emergency Head Computed Tomography (CT); Computer-Aided Diagnosis (CADx); PREDICTION MODEL; TOOL; APPLICABILITY; EXPLANATION; SENSITIVITY; QUALITY; PROBAST; BIAS; RISK
DOI
10.1016/j.ijmedinf.2024.105523
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Background: The surge in emergency head CT imaging and advancements in artificial intelligence (AI), especially deep learning (DL) and convolutional neural networks (CNN), have accelerated the development of computer-aided diagnosis (CADx) for emergency imaging. External validation assesses model generalizability, providing preliminary evidence of clinical potential. Objectives: This study systematically reviews externally validated CNN-CADx models for emergency head CT scans, critically appraises their diagnostic test accuracy (DTA), and assesses adherence to reporting guidelines. Methods: Studies comparing CNN-CADx model performance to a reference standard were eligible. The review was registered in PROSPERO (CRD42023411641) and conducted on Medline, Embase, EBM Reviews and Web of Science following the PRISMA-DTA guideline. DTA results and reporting were systematically extracted and appraised using standardised checklists (STARD, CHARMS, CLAIM, TRIPOD, PROBAST, QUADAS-2). Results: Six of 5636 identified studies were eligible. The common target condition was intracranial haemorrhage (ICH), and the intended workflow roles were auxiliary to experts. Due to methodological and clinical between-study variation, meta-analysis was inappropriate. Scan-level sensitivity exceeded 90% in 5/6 studies, while specificity ranged from 58.0% to 97.7%. The 95% predictive region of the SROC was markedly broader than the confidence region, extending above 50% sensitivity and 20% specificity. All studies had an unclear or high risk of bias and concern for applicability (QUADAS-2, PROBAST), and reporting adherence was below 50% for 20 of 32 TRIPOD items. Conclusion: Only about 0.1% of identified studies (6/5636) met the eligibility criteria. The evidence on the DTA of CNN-CADx models for emergency head CT scans remains limited in the scope of this review, as the reviewed studies were scarce, unsuitable for meta-analysis and undermined by inadequate methodological conduct and reporting.
Properly conducted external validation remains a preliminary step in evaluating the clinical potential of AI-CADx models; prospective and pragmatic clinical validation in comparative trials remains most crucial. In conclusion, future AI-CADx research should be methodologically standardised and reported in a clinically meaningful way to avoid research waste.
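The scan-level sensitivity and specificity figures reported in the abstract can be made concrete with a minimal sketch of how these DTA metrics are derived from a 2x2 confusion matrix. The counts below are hypothetical illustrations, not data from any of the reviewed studies.

```python
# Scan-level diagnostic test accuracy metrics from a 2x2 confusion matrix.
# Hypothetical counts only -- not taken from the reviewed studies.

def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN) -- proportion of positive scans detected."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP) -- proportion of negative scans cleared."""
    return tn / (tn + fp)

# Hypothetical scan-level counts for an ICH-detection model:
tp, fn = 92, 8      # 100 scans with haemorrhage
tn, fp = 580, 420   # 1000 scans without haemorrhage

print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 92/100  -> 92.0%
print(f"specificity = {specificity(tn, fp):.1%}")  # 580/1000 -> 58.0%
```

A model with these counts would sit at the lower end of the specificity range (58.0%) while clearing the >90% sensitivity threshold that 5/6 reviewed studies reported.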
Pages: 11
Related papers
39 items in total
[1]   Evaluating medical tests: introducing the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy [J].
Bossuyt, Patrick M. ;
Deeks, Jonathan J. ;
Leeflang, Mariska M. ;
Takwoingi, Yemisi ;
Flemyng, Ella .
COCHRANE DATABASE OF SYSTEMATIC REVIEWS, 2023, (07)
[2]   An analysis reveals differences between pragmatic and explanatory diagnostic accuracy studies [J].
Bossuyt, Patrick M. ;
Olsen, Maria ;
Hyde, Chris ;
Cohen, Jeremie F. .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2020, 117 :29-35
[3]   Workload for radiologists during on-call hours: dramatic increase in the past 15 years [J].
Bruls, R. J. M. ;
Kwee, R. M. .
INSIGHTS INTO IMAGING, 2020, 11 (01)
[4]   Recommendations for the development and use of imaging test sets to investigate the test performance of artificial intelligence in health screening [J].
Chalkidou, Anastasia ;
Shokraneh, Farhad ;
Kijauskaite, Goda ;
Taylor-Phillips, Sian ;
Halligan, Steve ;
Wilkinson, Louise ;
Glocker, Ben ;
Garrett, Peter ;
Denniston, Alastair K. ;
Mackie, Anne ;
Seedat, Farah .
LANCET DIGITAL HEALTH, 2022, 4 (12) :E899-E905
[5]   Hybrid 3D/2D Convolutional Neural Network for Hemorrhage Evaluation on Head CT [J].
Chang, P. D. ;
Kuoy, E. ;
Grinband, J. ;
Weinberg, B. D. ;
Thompson, M. ;
Homo, R. ;
Chen, J. ;
Abcede, H. ;
Shafie, M. ;
Sugrue, L. ;
Filippi, C. G. ;
Su, M. -Y. ;
Yu, W. ;
Hess, C. ;
Chow, D. .
AMERICAN JOURNAL OF NEURORADIOLOGY, 2018, 39 (09) :1609-1616
[6]   STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration [J].
Cohen, Jeremie F. ;
Korevaar, Daniel A. ;
Altman, Douglas G. ;
Bruns, David E. ;
Gatsonis, Constantine A. ;
Hooft, Lotty ;
Irwig, Les ;
Levine, Deborah ;
Reitsma, Johannes B. ;
de Vet, Henrica C. W. ;
Bossuyt, Patrick M. M. .
BMJ OPEN, 2016, 6 (11)
[7]   Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence [J].
Collins, Gary S. ;
Dhiman, Paula ;
Andaur Navarro, Constanza L. ;
Ma, Ji ;
Hooft, Lotty ;
Reitsma, Johannes B. ;
Logullo, Patricia ;
Beam, Andrew L. ;
Peng, Lily ;
Van Calster, Ben ;
van Smeden, Maarten ;
Riley, Richard D. ;
Moons, Karel G. M. .
BMJ OPEN, 2021, 11 (07)
[8]   Deeks J. J., 2022, Cochrane Handbook for Systematic Reviews of Interventions Version 6.3
[9]   Methodological conduct of prognostic prediction models developed using machine learning in oncology: a systematic review [J].
Dhiman, Paula ;
Ma, Jie ;
Navarro, Constanza L. Andaur ;
Speich, Benjamin ;
Bullock, Garrett ;
Damen, Johanna A. A. ;
Hooft, Lotty ;
Kirtley, Shona ;
Riley, Richard D. ;
Van Calster, Ben ;
Moons, Karel G. M. ;
Collins, Gary S. .
BMC MEDICAL RESEARCH METHODOLOGY, 2022, 22 (01)
[10]   Mitigating Bias in Radiology Machine Learning: 3. Performance Metrics [J].
Faghani, Shahriar ;
Khosravi, Bardia ;
Zhang, Kuan ;
Moassefi, Mana ;
Jagtap, Jaidip Manikrao ;
Nugen, Fred ;
Vahdati, Sanaz ;
Kuanar, Shiba P. ;
Rassoulinejad-Mousavi, Seyed Moein ;
Singh, Yashbir ;
Garcia, Diana V. Vera ;
Rouzrokh, Pouria ;
Erickson, Bradley J. .
RADIOLOGY-ARTIFICIAL INTELLIGENCE, 2022, 4 (05)