Deep neural networks are superior to dermatologists in melanoma image classification

Cited by: 189
Authors
Brinker, Titus J. [1 ,2 ]
Hekler, Achim [1 ]
Enk, Alexander H. [2 ]
Berking, Carola [3 ]
Haferkamp, Sebastian [4 ]
Hauschild, Axel [5 ]
Weichenthal, Michael [5 ]
Klode, Joachim [6 ]
Schadendorf, Dirk [6 ]
Holland-Letz, Tim [7 ]
von Kalle, Christof [1 ]
Froehling, Stefan [1 ]
Schilling, Bastian [8 ]
Utikal, Jochen S. [9 ,10 ]
Affiliations
[1] German Canc Res Ctr, Natl Ctr Tumor Dis NCT, Neuenheimer Feld 460, D-69120 Heidelberg, Germany
[2] Univ Hosp Heidelberg, Dept Dermatol, Heidelberg, Germany
[3] Univ Hosp Munich LMU, Dept Dermatol, Munich, Germany
[4] Univ Hosp Regensburg, Dept Dermatol, Regensburg, Germany
[5] Univ Hosp Kiel, Dept Dermatol, Kiel, Germany
[6] Univ Hosp Essen, Dept Dermatol, Essen, Germany
[7] German Canc Res Ctr, Dept Biostat, Heidelberg, Germany
[8] Univ Hosp Wurzburg, Dept Dermatol, Wurzburg, Germany
[9] Heidelberg Univ, Dept Dermatol, Mannheim, Germany
[10] German Canc Res Ctr, Skin Canc Unit, Heidelberg, Germany
Keywords
Deep learning; Melanoma; Skin cancer; Artificial intelligence; Algorithms; Diagnosis
DOI
10.1016/j.ejca.2019.05.023
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Melanoma is the most dangerous type of skin cancer but is curable if detected early. Recent publications demonstrated that artificial intelligence is capable of classifying images of benign nevi and melanoma with dermatologist-level precision. However, a statistically significant improvement compared with dermatologist classification has not been reported to date. Methods: For this comparative study, 4204 biopsy-proven images of melanoma and nevi (1:1) were used for the training of a convolutional neural network (CNN). New techniques of deep learning were integrated. For the experiment, an additional 804 biopsy-proven dermoscopic images of melanoma and nevi (1:1) were randomly presented to dermatologists of nine German university hospitals, who evaluated the quality of each image and stated their recommended treatment (19,296 recommendations in total). Three McNemar's tests comparing the results of the CNN's test runs in terms of sensitivity, specificity and overall correctness were predefined as the main outcomes. Findings: The respective sensitivity and specificity of lesion classification by the dermatologists were 67.2% (95% confidence interval [CI]: 62.6%-71.7%) and 62.2% (95% CI: 57.6%-66.9%). In comparison, the trained CNN achieved a higher sensitivity of 82.3% (95% CI: 78.3%-85.7%) and a higher specificity of 77.9% (95% CI: 73.8%-81.8%). The three McNemar's tests in 2 x 2 tables all reached a significance level of p < 0.001. This significance level was sustained for both subgroups. Interpretation: For the first time, automated dermoscopic melanoma image classification was shown to be significantly superior to both junior and board-certified dermatologists (p < 0.001). © 2019 The Authors. Published by Elsevier Ltd.
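Note: The outcome measures named in the abstract (sensitivity, specificity and McNemar's test on a 2 x 2 table) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names and the toy labels are hypothetical, and the sketch assumes one paired correct/incorrect outcome per image for the two raters being compared, which the abstract does not specify. It uses the exact (binomial) form of McNemar's test, while the abstract does not say which variant was applied.

```python
from math import comb


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); label 1 = melanoma, 0 = nevus."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def discordant_counts(y_true, pred_a, pred_b):
    """Count images where exactly one of the two raters is correct.

    Returns (b, c): b = A correct and B wrong, c = A wrong and B correct.
    These are the off-diagonal cells of the 2 x 2 table.
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to k, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy labels only (hypothetical, not data from the study).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
cnn = [1, 1, 1, 0, 0, 0, 0, 1]
human = [1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(truth, cnn)
b, c = discordant_counts(truth, cnn, human)
p_value = mcnemar_exact(b, c)
```

With real data, `b` and `c` would come from the paired per-image outcomes, and the same test could be run separately on the melanoma images (sensitivity), the nevus images (specificity) and all images (overall correctness).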
Pages: 11-17
Number of pages: 7
Related papers
21 in total
[1]   [Anonymous], EUR J CANC
[2]   [Anonymous], ARCH PATHOL LAB MED
[3]   Deep learning outperformed 136 of 157 dermatologists in a head-to-head dermoscopic melanoma image classification task [J].
Brinker, Titus J. ;
Hekler, Achim ;
Enk, Alexander H. ;
Klode, Joachim ;
Hauschild, Axel ;
Berking, Carola ;
Schilling, Bastian ;
Haferkamp, Sebastian ;
Schadendorf, Dirk ;
Holland-Letz, Tim ;
Utikal, Jochen S. ;
von Kalle, Christof .
EUROPEAN JOURNAL OF CANCER, 2019, 113 :47-54
[4]   A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task [J].
Brinker, Titus J. ;
Hekler, Achim ;
Enk, Alexander H. ;
Klode, Joachim ;
Hauschild, Axel ;
Berking, Carola ;
Schilling, Bastian ;
Haferkamp, Sebastian ;
Schadendorf, Dirk ;
Froehling, Stefan ;
Utikal, Jochen S. ;
von Kalle, Christof ;
Ludwig-Peitsch, Wiebke ;
Sirokay, Judith ;
Heinzerling, Lucie ;
Albrecht, Magarete ;
Baratella, Katharina ;
Bischof, Lena ;
Chorti, Eleftheria ;
Dith, Anna ;
Drusio, Christina ;
Giese, Nina ;
Gratsias, Emmanouil ;
Griewank, Klaus ;
Hallasch, Sandra ;
Hanhart, Zdenka ;
Herz, Saskia ;
Hohaus, Katja ;
Jansen, Philipp ;
Jockenhoefer, Finja ;
Kanaki, Theodora ;
Knispel, Sarah ;
Leonhard, Katja ;
Martaki, Anna ;
Matei, Liliana ;
Matull, Johanna ;
Olischewski, Alexandra ;
Petri, Maximilian ;
Placke, Jan-Malte ;
Raub, Simon ;
Salva, Katrin ;
Schlott, Swantje ;
Sody, Elsa ;
Steingrube, Nadine ;
Stoffels, Ingo ;
Ugurel, Selma ;
Sondermann, Wiebke ;
Zaremba, Anne ;
Gebhardt, Christoffer ;
Booken, Nina .
EUROPEAN JOURNAL OF CANCER, 2019, 111 :148-154
[5]   Comparing artificial intelligence algorithms to 157 German dermatologists: the melanoma classification benchmark [J].
Brinker, Titus J. ;
Hekler, Achim ;
Hauschild, Axel ;
Berking, Carola ;
Schilling, Bastian ;
Enk, Alexander H. ;
Haferkamp, Sebastian ;
Karoglan, Ante ;
von Kalle, Christof ;
Weichenthal, Michael ;
Sattler, Elke ;
Schadendorf, Dirk ;
Gaiser, Maria R. ;
Klode, Joachim ;
Utikal, Jochen S. .
EUROPEAN JOURNAL OF CANCER, 2019, 111 :30-37
[6]   Skin Cancer Classification Using Convolutional Neural Networks: Systematic Review [J].
Brinker, Titus Josef ;
Hekler, Achim ;
Utikal, Jochen Sven ;
Grabe, Niels ;
Schadendorf, Dirk ;
Klode, Joachim ;
Berking, Carola ;
Steeb, Theresa ;
Enk, Alexander H. ;
von Kalle, Christof .
JOURNAL OF MEDICAL INTERNET RESEARCH, 2018, 20 (10)
[7]   Pathologists' diagnosis of invasive melanoma and melanocytic proliferations: observer accuracy and reproducibility study [J].
Elmore, Joann G. ;
Barnhill, Raymond L. ;
Elder, David E. ;
Longton, Gary M. ;
Pepe, Margaret S. ;
Reisch, Lisa M. ;
Carney, Patricia A. ;
Titus, Linda J. ;
Nelson, Heidi D. ;
Onega, Tracy ;
Tosteson, Anna N. A. ;
Weinstock, Martin A. ;
Knezevich, Stevan R. ;
Piepkorn, Michael W. .
BMJ-BRITISH MEDICAL JOURNAL, 2017, 357
[8]   Dermatologist-level classification of skin cancer with deep neural networks [J].
Esteva, Andre ;
Kuprel, Brett ;
Novoa, Roberto A. ;
Ko, Justin ;
Swetter, Susan M. ;
Blau, Helen M. ;
Thrun, Sebastian .
NATURE, 2017, 542 (7639) :115-+
[9]   Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs [J].
Gulshan, Varun ;
Peng, Lily ;
Coram, Marc ;
Stumpe, Martin C. ;
Wu, Derek ;
Narayanaswamy, Arunachalam ;
Venugopalan, Subhashini ;
Widner, Kasumi ;
Madams, Tom ;
Cuadros, Jorge ;
Kim, Ramasamy ;
Raman, Rajiv ;
Nelson, Philip C. ;
Mega, Jessica L. ;
Webster, R. .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2016, 316 (22) :2402-2410
[10]   Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists [J].
Haenssle, H. A. ;
Fink, C. ;
Schneiderbauer, R. ;
Toberer, F. ;
Buhl, T. ;
Blum, A. ;
Kalloo, A. ;
Hassens, A. Ben Hadj ;
Thomas, L. ;
Enk, A. ;
Uhlmann, L. .
ANNALS OF ONCOLOGY, 2018, 29 (08) :1836-1842