A multimodal transformer to fuse images and metadata for skin disease classification

Authors
Gan Cai
Yu Zhu
Yue Wu
Xiaoben Jiang
Jiongyao Ye
Dawei Yang
Affiliations
[1] East China University of Science and Technology, School of Information Science and Engineering
[2] Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine
[3] Fudan University
[4] Shanghai Engineering Research Center of Internet of Things for Respiratory Medicine
Source
The Visual Computer | 2023 / Vol. 39
Keywords
Skin disease; Deep learning; Transformer; Multimodal fusion; Attention
Abstract
Skin disease cases are rising in prevalence, and the diagnosis of skin diseases remains a challenging task in the clinic. Utilizing deep learning to diagnose skin diseases could help to meet these challenges. In this study, a novel neural network is proposed for the classification of skin diseases. Since the datasets for the research consist of skin disease images and clinical metadata, we propose a novel multimodal Transformer, which consists of two encoders, one for images and one for metadata, and one decoder to fuse the multimodal information. In the proposed network, a suitable Vision Transformer (ViT) model is utilized as the backbone to extract deep image features. The metadata are regarded as labels, and a new Soft Label Encoder (SLE) is designed to embed them. Furthermore, in the decoder part, a novel Mutual Attention (MA) block is proposed to better fuse image features and metadata features. To evaluate the model’s effectiveness, extensive experiments have been conducted on a private skin disease dataset and the benchmark dataset ISIC 2018. Compared with state-of-the-art methods, the proposed model shows better performance and represents an advancement in skin disease diagnosis.
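The abstract does not specify how the Soft Label Encoder or the Mutual Attention block are built, so the following is only a rough sketch of the general idea of fusing image tokens and metadata tokens with attention in both directions. Everything in it is an assumption made for illustration: the function names (`embed_metadata`, `cross_attention`, `mutual_attention`), the choice of plain scaled dot-product attention without learned projections, the mean pooling, and the random numbers standing in for ViT features and learned embeddings. It is not the authors' implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def embed_metadata(values, table):
    """Hypothetical soft-label embedding: scale one vector per metadata field
    by that field's value in [0, 1]. `table` stands in for learned weights."""
    return [[v * w for w in vec] for v, vec in zip(values, table)]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention; each query token attends over keys_values.
    No learned projections are applied (a simplification)."""
    dim = len(queries[0])
    keys_t = [list(col) for col in zip(*keys_values)]
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, keys_values)


def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def mutual_attention(image_tokens, meta_tokens):
    """Assumed fusion: attend in both directions, pool, then concatenate."""
    image_view = cross_attention(image_tokens, meta_tokens)  # image queries metadata
    meta_view = cross_attention(meta_tokens, image_tokens)   # metadata queries image
    return mean_pool(image_view) + mean_pool(meta_view)


# Toy usage: 5 image tokens and 3 metadata fields, all of width 4.
random.seed(0)
dim = 4
image_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
meta_values = [1.0, 0.0, 0.35]  # made-up soft labels, one per metadata field
table = [[random.gauss(0, 1) for _ in range(dim)] for _ in meta_values]
meta_tokens = embed_metadata(meta_values, table)
fused = mutual_attention(image_tokens, meta_tokens)
```

In this sketch the fused vector has twice the token width, because the two pooled views are concatenated; a classifier head would then map it to disease classes. A real model would add learned query, key, and value projections, multiple heads, and normalization layers.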
Pages: 2781-2793
Page count: 12