The need to separate the wheat from the chaff in medical informatics: Introducing a comprehensive checklist for the (self)-assessment of medical AI studies

Cited by: 178
Authors
Cabitza, Federico [1]
Campagner, Andrea [1]
Affiliations
[1] Univ Milano Bicocca, DISCO, Viale Sarca 336, I-20126 Milan, Italy
Keywords
Medical artificial intelligence; Machine learning; Checklist; Quality auditing; ARTIFICIAL-INTELLIGENCE; EXTERNAL VALIDATION; BIG DATA; PERFORMANCE; GUIDELINES; PROMISE; MODEL
DOI
10.1016/j.ijmedinf.2021.104510
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This editorial aims to contribute to the current debate about the quality of studies that apply machine learning (ML) methodologies to medical data in order to extract value from those data and provide clinicians with viable and useful tools supporting everyday care practices. We propose a practical checklist to help authors self-assess the quality of their contribution and to help reviewers recognize and appreciate high-quality medical ML studies by distinguishing them from the mere application of ML techniques to medical data.
Pages: 7
Related papers
94 entries in total
[1]   Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis [J].
Aggarwal, Ravi ;
Sounderajah, Viknesh ;
Martin, Guy ;
Ting, Daniel S. W. ;
Karthikesalingam, Alan ;
King, Dominic ;
Ashrafian, Hutan ;
Darzi, Ara .
NPJ DIGITAL MEDICINE, 2021, 4 (01)
[2]   Sample-Size Determination Methodologies for Machine Learning in Medical Imaging Research: A Systematic Review [J].
Balki, Indranil ;
Amirabadi, Afsaneh ;
Levman, Jacob ;
Martel, Anne L. ;
Emersic, Ziga ;
Meden, Blaz ;
Garcia-Pedrero, Angel ;
Ramirez, Saul C. ;
Kong, Dehan ;
Moody, Alan R. ;
Tyrrell, Pascal N. .
CANADIAN ASSOCIATION OF RADIOLOGISTS JOURNAL-JOURNAL DE L ASSOCIATION CANADIENNE DES RADIOLOGISTES, 2019, 70 (04) :344-353
[3]   Challenges to the Reproducibility of Machine Learning Models in Health Care [J].
Beam, Andrew L. ;
Manrai, Arjun K. ;
Ghassemi, Marzyeh .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2020, 323 (04) :305-306
[4]   External validation is necessary in prediction research: A clinical example [J].
Bleeker, SE ;
Moll, HA ;
Steyerberg, EW ;
Donders, ART ;
Derksen-Lubsen, G ;
Grobbee, DE ;
Moons, KGM .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2003, 56 (09) :826-832
[5]   Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods [J].
Borra, Simone ;
Di Ciaccio, Agostino .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (12) :2976-2989
[6]  
Bouthillier X. et al., 2021, P MACH LEARN SYST, V3
[7]   As if sand were stone. New concepts and metrics to probe the ground on which to build trustable AI [J].
Cabitza, Federico ;
Campagner, Andrea ;
Sconfienza, Luca Maria .
BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (01)
[8]   The Elephant in the Machine: Proposing a New Metric of Data Reliability and its Application to a Medical Case to Assess Classification Reliability [J].
Cabitza, Federico ;
Campagner, Andrea ;
Albano, Domenico ;
Aliprandi, Alberto ;
Bruno, Alberto ;
Chianca, Vito ;
Corazza, Angelo ;
Di Pietto, Francesco ;
Gambino, Angelo ;
Gitto, Salvatore ;
Messina, Carmelo ;
Orlandi, Davide ;
Pedone, Luigi ;
Zappia, Marcello ;
Sconfienza, Luca Maria .
APPLIED SCIENCES-BASEL, 2020, 10 (11)
[9]   Unintended Consequences of Machine Learning in Medicine [J].
Cabitza, Federico ;
Rasoini, Raffaele ;
Gensini, Gian Franco .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2017, 318 (06) :517-518
[10]  
Chan S.C., 2020, INT C LEARN REPR ADD