Public Covid-19 X-ray datasets and their impact on model bias - A systematic review of a significant problem

Cited by: 50
Authors
Cruz, Beatriz Garcia Santa [1, 2]
Bossa, Matias Nicolas [2, 3]
Solter, Jan [2 ]
Husch, Andreas Dominik [2 ]
Affiliations
[1] Ctr Hosp Luxembourg, 4 Rue Ernest Barble, L-1210 Luxembourg, Luxembourg
[2] Univ Luxembourg, Luxembourg Ctr Syst Biomed, 7 Ave Hauts Fourneaux, L-4362 Esch Sur Alzette, Luxembourg
[3] Vrije Univ Brussel VUB, Dept Elect & Informat ETRO, Pl Laan 2, B-1050 Brussels, Belgium
Keywords
COVID-19; Machine learning; Datasets; X-Ray; Imaging; Review; Bias; Confounding; PREDICTION MODEL; APPLICABILITY; EXPLANATION; PROBAST; RISK; TOOL;
DOI
10.1016/j.media.2021.102225
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Computer-aided diagnosis and stratification of COVID-19 based on chest X-ray suffers from weak bias assessment and limited quality control. Undetected bias, induced by inappropriate use of datasets and improper consideration of confounders, prevents the translation of prediction models into clinical practice. By adapting established tools for model evaluation to the task of evaluating datasets, this study provides a systematic appraisal of publicly available COVID-19 chest X-ray datasets, determining their potential use and evaluating potential sources of bias. Only 9 out of more than a hundred identified datasets met at least the criteria for proper assessment of risk of bias and could be analysed in detail. Remarkably, most of the datasets utilised in 201 papers published in peer-reviewed journals are not among these 9 datasets, thus leading to models with high risk of bias. This raises concerns about the suitability of such models for clinical use. This systematic review highlights the limited description of datasets employed for modelling and helps researchers select the most suitable datasets for their task. © 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 16