Skin Lesion Segmentation Based on Vision Transformers and Convolutional Neural Networks-A Comparative Study

Citations: 60
Authors
Gulzar, Yonis [1]
Khan, Sumeer Ahmad [2]
Affiliations
[1] King Faisal Univ, Coll Business Adm, Dept Management Informat Syst, Al Hufuf 31982, Saudi Arabia
[2] King Abdullah Univ Sci & Technol, Biol & Environm Sci & Engn, Jeddah 23955, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2022 / Volume 12 / Issue 12
Keywords
melanoma; lesion; segmentation; transformers; convolutional neural networks; EPILUMINESCENCE MICROSCOPY; ABCD RULE; CLASSIFICATION; DERMATOSCOPY; MELANOMA; CANCER;
DOI
10.3390/app12125990
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Melanoma skin cancer is considered one of the most common diseases in the world. Detecting such diseases at an early stage is important for saving lives. During medical examinations, visually inspecting such lesions is not an easy task, as different lesions resemble one another. Deep learning methods have been used to diagnose skin lesions. Over the last decade, deep learning, especially convolutional neural networks (CNNs), has proven to be one of the promising methods for achieving state-of-the-art results in a variety of medical imaging applications. However, ConvNets' capabilities are considered limited because they do not capture long-range spatial relations in images. The recently proposed Vision Transformer (ViT) for image classification employs a purely self-attention-based model that learns long-range spatial relations to focus on the image's relevant parts. To achieve better performance, existing transformer-based network architectures require large-scale datasets. However, because medical imaging datasets are small, applying pure transformers to medical image analysis is difficult. ViT emphasizes low-resolution features, and its successive downsampling results in a lack of detailed localization information, rendering it unsuitable for skin lesion image classification. To improve the recovery of detailed localization information, several ViT-based image segmentation methods have recently been combined with ConvNets in the natural image domain. This study provides a comprehensive comparison of U-Net and attention-based methods for skin lesion image segmentation, which will assist in the diagnosis of skin lesions. The results show that the hybrid TransUNet, with an accuracy of 92.11% and a Dice coefficient of 89.84%, outperforms the other benchmarked methods.
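The abstract reports pixel accuracy and the Dice coefficient but does not define them. The sketch below uses the standard definitions for binary segmentation masks, Dice = 2TP / (2TP + FP + FN) and accuracy = (TP + TN) / (all pixels), on the assumption that the paper follows them. The function names, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are illustrative choices, not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed format: 2-D binary mask, 1 = lesion pixel, 0 = background pixel.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) over all pixels of two same-shaped binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice_coefficient(pred: Mask, target: Mask) -> float:
    """Dice = 2*TP / (2*TP + FP + FN); 1.0 if both masks are empty (a convention)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denominator = 2 * tp + fp + fn
    return 1.0 if denominator == 0 else 2 * tp / denominator


def pixel_accuracy(pred: Mask, target: Mask) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return (tp + tn) / total


# Toy 2x3 masks (hypothetical data, not from the paper).
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]

print(f"Dice: {dice_coefficient(predicted, ground_truth):.4f}")
print(f"Accuracy: {pixel_accuracy(predicted, ground_truth):.4f}")
```

Dice ignores true-negative background pixels, so it is less inflated than accuracy when the lesion covers only a small part of the image. That is one reason segmentation studies commonly report both.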
Pages: 17