Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets

Cited by: 8

Authors:

de Lima, Leandro M. [1,2]

Krohling, Renato A. [1,2]

Affiliations:

[1] Univ Fed Espirito Santo, Grad Program Comp Sci, Vitoria, ES, Brazil

[2] Univ Fed Espirito Santo, DEPR, Labcin Nat Inspired Comp Lab, Vitoria, ES, Brazil

Source:

INTELLIGENT SYSTEMS, PT II | 2022 / Vol. 13654

Keywords:

Transformer; Convolutional neural network; Skin lesion; Multimodal fusion; Classification;

DOI:

10.1007/978-3-031-21689-3_21

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Skin cancer is one of the most common types of cancer in the world. Different computer-aided diagnosis systems have been proposed to tackle skin lesion diagnosis, most of them based on deep convolutional neural networks. However, recent advances in computer vision, notably transformer-based networks, have achieved state-of-the-art results in many tasks. We explore and evaluate advances in computer vision architectures, training methods and multimodal feature fusion for the skin lesion diagnosis task. Experiments show that PiT (0.800 +/- 0.006), CoaT (0.780 +/- 0.024) and ViT (0.771 +/- 0.018) transformer-based backbone models with MetaBlock fusion achieved state-of-the-art results for balanced accuracy on the PAD-UFES-20 dataset.
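The abstract reports balanced accuracy as a mean +/- spread. As a minimal sketch of how such a figure can be computed, the Python below takes balanced accuracy to be the unweighted mean of per-class recall and summarizes several runs with the sample standard deviation. The function names, the two class labels and the fold scores are hypothetical illustrations, and the record does not state which spread estimator or how many runs the authors used.

```python
from collections import defaultdict
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition)."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return mean(hits[c] / totals[c] for c in totals)


def summarize(scores):
    """Return (mean, sample standard deviation) over repeated runs."""
    return mean(scores), stdev(scores)


# Hypothetical labels: a classifier that always predicts the majority class.
y_true = ["benign", "benign", "benign", "malignant"]
y_pred = ["benign", "benign", "benign", "benign"]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)

# Hypothetical per-fold scores, not values from the paper.
fold_mean, fold_std = summarize([0.5, 0.7])
```

On this toy input the plain accuracy is 0.75 while the balanced accuracy is 0.5, which illustrates why a class-balanced metric is a natural choice when lesion classes are unevenly represented.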


Pages: 282-296

Page count: 15
