Adversarial attack on deep learning-based dermatoscopic image recognition systems: Risk of misdiagnosis due to undetectable image perturbations

Cited by: 12
Authors
Allyn, Jerome [1, 2]
Allou, Nicolas [1, 2]
Vidal, Charles [1]
Renou, Amelie [1]
Ferdynus, Cyril [2, 3, 4]
Affiliations
[1] St Denis Univ Hosp, Intens Care Unit, St Denis, Reunion Island, France
[2] St Denis Univ Hosp, Clin Informat Dept, St Denis, Reunion Island, France
[3] St Denis Univ Hosp, Methodol Support Unit, St Denis, Reunion Island, France
[4] INSERM, CIC 1410, F-97410 St Pierre, Reunion, France
Keywords
adversarial attack; artificial intelligence; deep learning; dermatoscopic lesions; image recognition systems; DIABETIC-RETINOPATHY; HEALTH-CARE; VALIDATION; DIAGNOSIS;
DOI
10.1097/MD.0000000000023568
Chinese Library Classification (CLC) number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Deep learning algorithms have shown excellent performance in the field of medical image recognition, and practical applications have been made in several medical domains. Little is known about the feasibility and impact of undetectable adversarial attacks, which can disrupt an algorithm by modifying a single pixel of the image to be interpreted. The aim of the study was to test the feasibility and impact of an adversarial attack on the accuracy of a deep learning-based dermatoscopic image recognition system. First, the pre-trained convolutional neural network DenseNet-201 was trained to classify images from the training set into 7 categories. Second, an adversarial neural network was trained to generate undetectable perturbations on images from the test set, so that all perturbed images would be classified as melanocytic nevi. The perturbed images were classified using the model generated in the first step. This study used the HAM-10000 dataset, an open source image database containing 10,015 dermatoscopic images, which was split into a training set and a test set. The accuracy of the generated classification model was evaluated using images from the test set. The accuracy of the model with and without perturbed images was compared. The ability of 2 observers to detect image perturbations was evaluated, and the interobserver agreement was calculated. The overall accuracy of the classification model dropped from 84% (95% confidence interval (CI): 82-86) for unperturbed images to 67% (95% CI: 65-69) for perturbed images (McNemar test, P < .0001). The fooling ratio reached 100% for all categories of skin lesions. Sensitivity and specificity of the combined observers, calculated on a random sample of 50 images, were 58.3% (95% CI: 45.9-70.8) and 42.5% (95% CI: 27.2-57.8), respectively. The kappa agreement coefficient between the 2 observers was negative at -0.22 (95% CI: -0.49 to -0.04).
Adversarial attacks on medical image databases can distort interpretation by image recognition algorithms, are easy to make, and are undetectable by humans. It seems essential to improve our understanding of deep learning-based image recognition systems and to upgrade their security before putting them to practical and daily use.
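The evaluation measures named in the abstract (accuracy with and without perturbation, fooling ratio, McNemar test, and the kappa agreement coefficient) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy labels, and the definition of the fooling ratio used here (share of perturbed images predicted as the target class) are assumptions made for illustration, and the McNemar test is shown in its exact binomial form, which may differ from the variant used in the study.

```python
from math import comb

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def fooling_ratio(pred_perturbed, target):
    """Assumed definition: share of perturbed images predicted as the target class."""
    return sum(p == target for p in pred_perturbed) / len(pred_perturbed)

def mcnemar_exact(y_true, pred_clean, pred_perturbed):
    """Two-sided exact McNemar p-value on paired correct/incorrect outcomes."""
    # b: correct on clean image, wrong on perturbed; c: the reverse.
    b = sum(pc == t and pp != t for t, pc, pp in zip(y_true, pred_clean, pred_perturbed))
    c = sum(pc != t and pp == t for t, pc, pp in zip(y_true, pred_clean, pred_perturbed))
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, nothing to test
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(k) / n) * (rater_b.count(k) / n) for k in labels)
    return (observed - expected) / (1 - expected)

# Toy data (hypothetical, not from HAM-10000): "nv" stands for melanocytic nevi.
y_true = ["nv", "mel", "bcc", "nv", "mel", "bkl"]
pred_clean = ["nv", "mel", "bcc", "nv", "nv", "bkl"]
pred_perturbed = ["nv"] * 6  # a fully successful targeted attack

# Two observers flagging images as perturbed (1) or not (0).
observer_1 = [1, 0, 1, 0]
observer_2 = [0, 1, 0, 1]
```

In this toy example every perturbed image is assigned to the target class, so accuracy falls to the share of images that are truly nevi. The two observers disagree on every image, which gives a negative kappa, meaning agreement worse than chance.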
Pages: 6