Multi-label classification of retinal disease via a novel vision transformer model

Cited by: 4
Authors
Wang, Dong [1]
Lian, Jian [2]
Jiao, Wanzhen [3]
Affiliations
[1] Shandong Jiaotong Univ, Sch Informat Sci & Elect Engn, Jinan, Peoples R China
[2] Shandong Management Univ, Sch Intelligence Engn, Jinan, Peoples R China
[3] Shandong First Med Univ, Shandong Prov Hosp, Dept Ophthalmol, Jinan, Peoples R China
Keywords
retinal image; deep learning; multi-label classification; machine vision; medical image analysis
DOI
10.3389/fnins.2023.1290803
Chinese Library Classification (CLC) number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Introduction: The precise identification of retinal disorders is essential for preventing both temporary and permanent visual impairment. Prior research has yielded encouraging results in classifying retinal images for a single retinal condition. In clinical practice, however, it is not uncommon for a single patient to present with multiple retinal disorders concurrently. Classifying retinal images into multiple labels therefore remains a significant obstacle for existing methods, but solving it would provide insight into several conditions simultaneously.
Methods: This study presents a novel vision transformer architecture called retinal ViT, which applies the self-attention mechanism to medical image analysis. Because the study aims to show that transformer-based models can achieve performance competitive with CNN-based models, the convolutional modules have been eliminated from the proposed model. The model ends with a multi-label classifier built as a feed-forward network that consists of two layers and employs a sigmoid activation function.
Results and discussion: On the publicly available ODIR-2019 dataset, the proposed model outperforms state-of-the-art approaches such as ResNet, VGG, DenseNet, and MobileNet in terms of Kappa, F1 score, AUC, and AVG.
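The classifier described in the abstract (a two-layer feed-forward network with a sigmoid activation) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the class name `MultiLabelHead`, the layer sizes, the ReLU hidden activation, the random weights, and the 0.5 decision threshold are all assumptions made for illustration. The point it shows is that each label gets its own sigmoid output, so several disorders can be flagged for one image at the same time.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class MultiLabelHead:
    """Hypothetical two-layer feed-forward head with one sigmoid per label."""

    def __init__(self, in_dim: int, hidden_dim: int, num_labels: int, seed: int = 0):
        rng = random.Random(seed)
        # Small random weights stand in for parameters a real model would learn.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(num_labels)]
        self.b2 = [0.0] * num_labels

    def forward(self, features: list[float]) -> list[float]:
        # Layer 1: linear projection plus ReLU (the hidden activation is an assumption).
        hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Layer 2: one logit per label, squashed independently by a sigmoid.
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
                for row, b in zip(self.w2, self.b2)]

    def predict(self, features: list[float], threshold: float = 0.5) -> list[int]:
        # Each label is decided on its own, so any number of labels may be 1.
        return [int(p >= threshold) for p in self.forward(features)]


# Illustrative sizes only: an 8-dimensional feature vector and 8 labels.
head = MultiLabelHead(in_dim=8, hidden_dim=4, num_labels=8)
probs = head.forward([0.5] * 8)
labels = head.predict([0.5] * 8)
```

Unlike a softmax output, the per-label probabilities here are independent and need not sum to 1, which is what makes the head suitable for images carrying more than one diagnosis.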
Pages: 11