Reproducibility of artificial intelligence models in computed tomography of the head: a quantitative analysis

Cited: 0
Authors
Felix Gunzer
Michael Jantscher
Eva M. Hassler
Thomas Kau
Gernot Reishofer
Affiliations
[1] Medical University Graz, Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology
[2] Know-Center GmbH, Research Center for Data-Driven Business & Big Data Analytics
[3] Landeskrankenhaus Villach
[4] Medical University Graz, Department of Radiology
[5] BioTechMed-Graz, Department of Radiology
Source
Insights into Imaging | Volume 13
Keywords
Artificial intelligence; Head CT; Reproducibility; Epidemiology; Machine learning
DOI
Not available
Abstract
When developing artificial intelligence (AI) software for applications in radiology, the underlying research must be transferable to other real-world problems. To verify to what degree this is true, we reviewed research on AI algorithms for computed tomography of the head. A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). We identified 83 articles and analyzed them in terms of transparency of data and code, pre-processing, type of algorithm, architecture, hyperparameters, performance measures, and balancing of datasets in relation to epidemiology. We also classified all articles by their main functionality (classification, detection, segmentation, prediction, triage, image reconstruction, image registration, and fusion of imaging modalities). We found that only a minority of authors provided open-source code (10.15%, n = 7), making the replication of results difficult. Convolutional neural networks were predominantly used (32.61%, n = 15), whereas hyperparameters were less frequently reported (32.61%, n = 15). Datasets were mostly from single-center sources (84.05%, n = 58), increasing the models' susceptibility to bias and, consequently, their error rates. The prevalence of brain lesions in the training (0.49 ± 0.30) and testing (0.45 ± 0.29) datasets differed from real-world epidemiology (0.21 ± 0.28), which may lead to overestimated performance. This review highlights the need for open-source code, external validation, and consideration of disease prevalence.
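The effect of this prevalence mismatch can be made concrete with Bayes' rule. Below is a minimal Python sketch, not taken from the article, that computes the positive predictive value (PPV) at the mean prevalences reported in the review; the sensitivity and specificity of 0.90 are illustrative assumptions.

```python
# Minimal sketch (not from the article): how the gap between curated-dataset
# prevalence and real-world epidemiology can inflate apparent performance.
# Sensitivity and specificity below are illustrative assumptions.

def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    sens, spec = 0.90, 0.90  # assumed values, for illustration only
    # Mean lesion prevalences reported in the review:
    for label, prev in [("training datasets", 0.49),
                        ("testing datasets", 0.45),
                        ("real-world epidemiology", 0.21)]:
        print(f"{label:>24}: prevalence = {prev:.2f}, "
              f"PPV = {ppv(sens, spec, prev):.2f}")
```

With these assumed operating characteristics, the PPV falls from about 0.90 at the training-set prevalence to about 0.70 at the real-world prevalence, illustrating why performance measured on balanced datasets may not transfer to clinical practice.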