A multimodal transformer to fuse images and metadata for skin disease classification

Cited by: 0

Authors:

Gan Cai

Yu Zhu

Yue Wu

Xiaoben Jiang

Jiongyao Ye

Dawei Yang

Affiliations:

[1] East China University of Science and Technology, School of Information Science and Engineering

[2] Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine

[3] Fudan University

[4] Shanghai Engineering Research Center of Internet of Things for Respiratory Medicine

Source:

The Visual Computer | 2023 / Vol. 39

Keywords:

Skin disease; Deep learning; Transformer; Multimodal fusion; Attention;

DOI:

Not available

Abstract:

Skin disease cases are rising in prevalence, and diagnosing them remains a challenging clinical task. Deep learning could help meet this challenge. In this study, a novel neural network is proposed for skin disease classification. Because the datasets used consist of skin disease images and clinical metadata, we propose a novel multimodal Transformer comprising two encoders, one for images and one for metadata, and one decoder that fuses the multimodal information. In the proposed network, a suitable Vision Transformer (ViT) model serves as the backbone to extract deep image features. The metadata are treated as labels, and a new Soft Label Encoder (SLE) is designed to embed them. In the decoder, a novel Mutual Attention (MA) block is proposed to better fuse image features and metadata features. To evaluate the model's effectiveness, extensive experiments were conducted on a private skin disease dataset and the benchmark dataset ISIC 2018. Compared with state-of-the-art methods, the proposed model shows better performance and represents an advancement in skin disease diagnosis.
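The abstract does not describe the internals of the Mutual Attention block, so the following is only a minimal sketch of one plausible reading: bidirectional scaled dot-product cross-attention, in which image tokens attend to metadata tokens and metadata tokens attend to image tokens, followed by pooling and concatenation. All function names, token shapes, and values are hypothetical, and learned projections are omitted; this is not the authors' implementation.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    # Scaled dot-product attention: each query row attends over keys_values.
    # Assumption: learned Q/K/V projections are left out (identity) for brevity.
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(dim)])
    return out

def mean_pool(tokens):
    # Average the token vectors into a single vector.
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]

def mutual_attention_fuse(image_tokens, meta_tokens):
    # Hypothetical fusion: attend in both directions, pool, then concatenate.
    img_attended = cross_attention(image_tokens, meta_tokens)
    meta_attended = cross_attention(meta_tokens, image_tokens)
    return mean_pool(img_attended) + mean_pool(meta_attended)

# Toy inputs: 3 image tokens and 2 metadata tokens, all of dimension 4.
image_tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.3, 0.3]]
meta_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
fused = mutual_attention_fuse(image_tokens, meta_tokens)
```

In a real model, the image tokens would come from the ViT backbone and the metadata tokens from the Soft Label Encoder, and the fused vector would feed a classification head.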

Pages: 2781-2793

Page count: 12
