Analysis of the ISIC image datasets: Usage, benchmarks and recommendations

Cited by: 116
Authors
Cassidy, Bill [1]
Kendrick, Connah [1]
Brodzicki, Andrzej [2]
Jaworek-Korjakowska, Joanna [2]
Yap, Moi Hoon [1]
Affiliations
[1] Manchester Metropolitan Univ, John Dalton Bldg,Chester St, Manchester M1 5GD, Lancs, England
[2] AGH Univ Sci & Technol, Al Mickiewicza 30, PL-30059 Krakow, Poland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Skin cancer; Skin lesion classification; Deep convolutional neural networks; ISIC; Melanoma; SKIN-LESION SEGMENTATION; CONVOLUTIONAL NEURAL-NETWORK; DERMOSCOPIC IMAGES; CLASSIFICATION; MELANOMA; DERMATOLOGISTS; SUPERIOR; FEATURES; CANCER
DOI
10.1016/j.media.2021.102305
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The International Skin Imaging Collaboration (ISIC) datasets have become a leading repository for researchers in machine learning for medical image analysis, especially in the field of skin cancer detection and malignancy assessment. They contain tens of thousands of dermoscopic photographs together with gold-standard lesion diagnosis metadata. The associated yearly challenges have resulted in major contributions to the field, with papers reporting measures well in excess of human experts. Skin cancers can be divided into two major groups: melanoma and non-melanoma. Although less prevalent, melanoma is considered to be more serious as it can quickly spread to other organs if not treated at an early stage. In this paper, we summarise the usage of the ISIC dataset images and present an analysis of yearly releases over the period 2016 to 2020. Our analysis found a significant number of duplicate images, both within and between the datasets. We also noted duplicates spread across testing and training sets. Due to these irregularities, we propose a duplicate removal strategy and recommend a curated dataset for researchers to use when working on ISIC datasets. Given that ISIC 2020 focused on melanoma classification, we conduct experiments to provide benchmark results on the ISIC 2020 test set, with additional analysis on the smaller ISIC 2017 test set. Testing was completed following the application of our duplicate removal strategy and an additional data balancing step. As a result of removing 14,310 duplicate images from the training set, our benchmark results show good levels of melanoma prediction with an AUC of 0.80 for the best performing model. As our aim was not to maximise network performance, we did not include additional steps in our experiments. Finally, we provide recommendations for future research by highlighting irregularities that may present research challenges.
A list of image files with reference to the original ISIC dataset sources for the recommended curated training set will be shared on our GitHub repository (available at www.github.com/mmu-dermatology-research/isic_duplicate_removal_strategy). © 2021 Elsevier B.V. All rights reserved.
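The abstract reports duplicates within datasets, between datasets, and across training and testing sets, but it does not say how they were detected. As a rough illustration only, the sketch below finds byte-identical files by hashing their contents. It is an assumption made for this example and not the authors' duplicate removal strategy. The helper names (`file_digest`, `find_exact_duplicates`, `train_test_overlap`) are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_exact_duplicates(paths):
    """Group files by content hash and return groups with more than one file.

    Assumption: a duplicate here means byte-identical content. Resized or
    re-encoded copies of the same image are not detected by this check.
    """
    groups = defaultdict(list)
    for p in paths:
        groups[file_digest(p)].append(str(p))
    return [sorted(group) for group in groups.values() if len(group) > 1]


def train_test_overlap(train_paths, test_paths):
    """Return training files whose bytes also appear in the test set."""
    test_hashes = {file_digest(p) for p in test_paths}
    return sorted(str(p) for p in train_paths if file_digest(p) in test_hashes)
```

Passing the image paths from several yearly releases to `find_exact_duplicates` would surface both within-dataset and between-dataset copies, while `train_test_overlap` flags training images that also appear in a test set. Copies that differ in resolution or compression would need a similarity-based comparison instead of an exact hash.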
Pages: 15