Why rankings of biomedical image analysis competitions should be interpreted with care

Citations: 202
Authors
Maier-Hein, Lena [1 ]
Eisenmann, Matthias [1 ]
Reinke, Annika [1 ]
Onogur, Sinan [1 ]
Stankovic, Marko [1 ]
Scholz, Patrick [1 ]
Arbel, Tal [2 ]
Bogunovic, Hrvoje [3 ]
Bradley, Andrew P. [4 ]
Carass, Aaron [5 ]
Feldmann, Carolin [1 ]
Frangi, Alejandro F. [6 ]
Full, Peter M. [1 ]
van Ginneken, Bram [7 ]
Hanbury, Allan [8 ,9 ]
Honauer, Katrin [10 ]
Kozubek, Michal [11 ]
Landman, Bennett A. [12 ]
Marz, Keno [1 ]
Maier, Oskar [13 ]
Maier-Hein, Klaus [14 ]
Menze, Bjoern H. [15 ]
Muller, Henning [16 ]
Neher, Peter F. [14 ]
Niessen, Wiro [17 ,18 ,19 ]
Rajpoot, Nasir [20 ]
Sharp, Gregory C. [21 ]
Sirinukunwattana, Korsuk [22 ]
Speidel, Stefanie [23 ]
Stock, Christian [24 ]
Stoyanov, Danail [25 ,26 ]
Taha, Abdel Aziz [27 ]
Van der Sommen, Fons [28 ]
Wang, Ching-Wei [29 ]
Weber, Marc-Andre [30 ]
Zheng, Guoyan [31 ]
Jannin, Pierre [32 ]
Kopp-Schneider, Annette [33 ]
Affiliations
[1] German Canc Res Ctr, Div Comp Assisted Med Intervent CAMI, D-69120 Heidelberg, Germany
[2] McGill Univ, Ctr Intelligent Machines, Montreal, PQ H3A 0G4, Canada
[3] Med Univ Vienna, Dept Ophthalmol, Christian Doppler Lab Ophthalm Image Anal, A-1090 Vienna, Austria
[4] Queensland Univ Technol, Sci & Engn Fac, Brisbane, Qld 4001, Australia
[5] Johns Hopkins Univ, Dept Comp Sci, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[6] Univ Leeds, CISTIB Ctr Computat Imaging & Simulat Technol Bio, Leeds LS2 9JT, W Yorkshire, England
[7] Radboud Univ Nijmegen, Dept Radiol & Nucl Med, Med Image Anal, NL-6525 GA Nijmegen, Netherlands
[8] TU Wien, Inst Informat Syst Engn, A-1040 Vienna, Austria
[9] Complex Sci Hub Vienna, A-1080 Vienna, Austria
[10] Heidelberg Univ, Heidelberg Collaboratory Image Proc HCI, D-69120 Heidelberg, Germany
[11] Masaryk Univ, Ctr Biomed Image Anal, Brno 60200, Czech Republic
[12] Vanderbilt Univ, Elect Engn, Nashville, TN 37235 USA
[13] Univ Lubeck, Inst Med Informat, D-23562 Lubeck, Germany
[14] German Canc Res Ctr, Div Med Image Comp MIC, D-69120 Heidelberg, Germany
[15] Tech Univ Munich, Dept Informat, Inst Adv Studies, D-80333 Munich, Germany
[16] HES SO, Informat Syst Inst, CH-3960 Sierre, Switzerland
[17] Erasmus MC, Dept Radiol, NL-3015 GD Rotterdam, Netherlands
[18] Erasmus MC, Dept Nucl Med, NL-3015 GD Rotterdam, Netherlands
[19] Erasmus MC, Dept Med Informat, NL-3015 GD Rotterdam, Netherlands
[20] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
[21] Massachusetts Gen Hosp, Dept Radiat Oncol, Boston, MA 02114 USA
[22] Univ Oxford, Inst Biomed Engn, Oxford OX3 7DQ, England
[23] Natl Ctr Tumor Dis Dresden, Div Translat Surg Oncol TCO, D-01307 Dresden, Germany
[24] German Canc Res Ctr, Div Clin Epidemiol & Aging Res, D-69120 Heidelberg, Germany
[25] UCL, CMIC, London W1W 7TS, England
[26] UCL, Dept Comp Sci, London W1W 7TS, England
[27] Res Studios Austria FG, Data Sci Studio, A-1090 Vienna, Austria
[28] Eindhoven Univ Technol, Dept Elect Engn, NL-5600 MB Eindhoven, Netherlands
[29] Natl Taiwan Univ Sci & Technol, Grad Inst Biomed Engn, AIExplore, NTUST Ctr Comp Vis & Med Imaging, Taipei 106, Taiwan
[30] Univ Med Ctr Rostock, Inst Diagnost & Intervent Radiol, D-18051 Rostock, Germany
[31] Univ Bern, Inst Surg Technol & Biomech, CH-3014 Bern, Switzerland
[32] Univ Rennes, INSERM, LTSI UMR S 1099, F-35043 Rennes, France
[33] German Canc Res Ctr, Div Biostat, D-69120 Heidelberg, Germany
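Funding
Australian Research Council; European Research Council; UK Medical Research Council; UK Engineering and Physical Sciences Research Council; Wellcome Trust (UK); Swiss National Science Foundation;
Keywords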
COMPUTED-TOMOGRAPHY SCANS; SEGMENTATION ALGORITHMS; STANDARDIZED EVALUATION; EVALUATION FRAMEWORK; LESION SEGMENTATION; LUMEN SEGMENTATION; PULMONARY NODULES; VALIDATION; TRACKING; MRI;
DOI
10.1038/s41467-018-07619-7
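Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code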
07 ; 0710 ; 09 ;
Abstract
International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results is often hampered as only a fraction of relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables such as the test data used for validation, the ranking scheme applied and the observers that make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.
Pages: 13