Man against machine reloaded: performance of a market-approved convolutional neural network in classifying a broad spectrum of skin lesions in comparison with 96 dermatologists working under less artificial conditions

Cited by: 127
Authors
Haenssle, H. A. [1]
Fink, C. [1]
Toberer, F. [1]
Winkler, J. [1]
Stolz, W. [2]
Deinlein, T. [3]
Hofmann-Wellenhof, R. [3]
Lallas, A. [4]
Emmer, S. [5]
Buhl, T. [6]
Zutt, M. [7]
Blum, A. [8]
Abassi, M. S. [9]
Thomas, L. [10]
Tromme, I. [11]
Tschandl, P. [12]
Enk, A. [1]
Rosenberger, A. [13]
Affiliations
[1] Heidelberg Univ, Dept Dermatol, Neuenheimer Feld 440, D-69120 Heidelberg, Germany
[2] Dept Dermatol Allergol & Environm Med 2, Munich, Germany
[3] Med Univ Graz, Dept Dermatol & Venerol, Graz, Austria
[4] Aristotle Univ Thessaloniki, Dept Dermatol 1, Thessaloniki, Greece
[5] Univ Rostock, Dept Dermatol, Rostock, Germany
[6] Univ Gottingen, Dept Dermatol, Gottingen, Germany
[7] Klinikum Bremen Mitte, Dept Dermatol & Allergol, Bremen, Germany
[8] Off Based Clin Dermatol, Constance, Germany
[9] Univ Passau, Fac Comp Sci & Math, Passau, Germany
[10] Lyon 1 Univ, Lyons Canc Res Ctr, Dept Dermatol, Lyon, France
[11] Catholic Univ Louvain, St Luc Univ Hosp, Dept Dermatol, Brussels, Belgium
[12] Med Univ Vienna, Dept Dermatol, Vienna, Austria
[13] Univ Goettingen, Dept Genet Epidemiol, Gottingen, Germany
Keywords
deep learning; neural network; Moleanalyzer Pro; skin cancer; melanoma; dermoscopy; DIAGNOSIS; MELANOMA; CANCER; MULTICENTER; MANAGEMENT; SYSTEM
DOI
10.1016/j.annonc.2019.10.013
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Convolutional neural networks (CNNs) efficiently differentiate skin lesions by image analysis. Studies comparing a market-approved CNN across a broad range of diagnoses with dermatologists working under less artificial conditions are lacking.
Materials and methods: One hundred cases of pigmented/non-pigmented skin cancers and benign lesions were used for a two-level reader study with 96 dermatologists (level I: dermoscopy only; level II: clinical close-up images, dermoscopy, and textual information). Additionally, the dermoscopic images were classified by a CNN approved for the European market as a medical device (Moleanalyzer Pro, FotoFinder Systems, Bad Birnbach, Germany). Primary endpoints were the sensitivity and specificity of the CNN's dichotomous classification in comparison with the dermatologists' management decisions. Secondary endpoints included the dermatologists' diagnostic decisions, their performance according to their level of experience, and the CNN's area under the curve (AUC) of receiver operating characteristics (ROC).
Results: The CNN achieved a sensitivity, specificity, and ROC AUC of 95.0% (95% confidence interval [CI] 83.5% to 98.6%), 76.7% (95% CI 64.6% to 85.6%), and 0.918 (95% CI 0.866 to 0.970), respectively. In level I, the dermatologists' management decisions showed a mean sensitivity and specificity of 89.0% (95% CI 87.4% to 90.6%) and 80.7% (95% CI 78.8% to 82.6%). With level II information, the sensitivity significantly improved to 94.1% (95% CI 93.1% to 95.1%; P < 0.001), while the specificity remained unchanged at 80.4% (95% CI 78.4% to 82.4%; P = 0.97). When the CNN's specificity was fixed at the mean specificity of the dermatologists' level II management decisions (80.4%), the CNN's sensitivity was almost equal to that of the human raters, at 95% (95% CI 83.5% to 98.6%) versus 94.1% (95% CI 93.1% to 95.1%); P = 0.1. In contrast, the CNN outperformed the dermatologists in their level I management decisions and in their level I and II diagnostic decisions. More experienced dermatologists frequently surpassed the CNN's performance.
Conclusions: Under less artificial conditions and in a broader spectrum of diagnoses, the CNN and most dermatologists performed on the same level. Dermatologists are trained to integrate information from a range of sources, which renders comparative studies based solely on a single case image inadequate.
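As a rough illustration of how a sensitivity or specificity with a 95% CI like those reported above can be derived from case counts, the following Python sketch computes Wilson score intervals. Two things here are assumptions and are not stated in the abstract: the counts (38 of 40 malignant lesions and 46 of 60 benign lesions correctly classified, chosen only because they match the reported 95.0% and 76.7%) and the choice of the Wilson method itself. The function name `wilson_ci` is hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.959964 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# ASSUMED counts, not given in the abstract: they are merely consistent
# with the reported 95.0% sensitivity and 76.7% specificity in 100 cases.
true_positives, n_malignant = 38, 40
true_negatives, n_benign = 46, 60

for label, k, n in [
    ("Sensitivity", true_positives, n_malignant),
    ("Specificity", true_negatives, n_benign),
]:
    lo, hi = wilson_ci(k, n)
    print(f"{label}: {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

Under these assumed counts, the sketch yields intervals that agree with the CNN's reported sensitivity and specificity intervals to one decimal place. That agreement is suggestive but does not confirm which interval method or case split the authors used.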
Pages: 137-143
Number of pages: 7