Multi-label classification of retinal disease via a novel vision transformer model

Cited by: 4

Authors:

Wang, Dong [1]

Lian, Jian [2]

Jiao, Wanzhen [3]

Affiliations:

[1] Shandong Jiaotong Univ, Sch Informat Sci & Elect Engn, Jinan, Peoples R China

[2] Shandong Management Univ, Sch Intelligence Engn, Jinan, Peoples R China

[3] Shandong First Med Univ, Shandong Prov Hosp, Dept Ophthalmol, Jinan, Peoples R China

Source:

FRONTIERS IN NEUROSCIENCE | 2024 / Vol. 17

Keywords:

retinal image; deep learning; multi-label classification; machine vision; medical image analysis;

DOI:

10.3389/fnins.2023.1290803

Chinese Library Classification:

Q189 [Neuroscience];

Discipline classification code:

071006 ;

Abstract:

Introduction: Precise identification of retinal disorders is essential for preventing both temporary and permanent visual impairment. Prior research has achieved encouraging results in classifying retinal images for a single retinal condition. In clinical practice, however, it is not uncommon for a single patient to present with multiple retinal disorders concurrently. Multi-label classification of retinal images therefore remains a significant obstacle for existing methods, although solving it would provide insight into several conditions at once.

Methods: This study presents a novel vision transformer architecture, retinal ViT, which brings the self-attention mechanism to medical image analysis. Because the study aims to show that transformer-based models can achieve performance competitive with CNN-based models, the convolutional modules have been removed from the proposed model. The model ends with a multi-label classifier built as a two-layer feed-forward network with a sigmoid activation function.

Results and discussion: On the publicly available ODIR-2019 dataset, the proposed model outperforms state-of-the-art approaches such as ResNet, VGG, DenseNet, and MobileNet in terms of Kappa, F1 score, AUC, and AVG.
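The classifier described in the abstract (a two-layer feed-forward network with a sigmoid activation) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class name `MultiLabelHead`, the layer sizes, the ReLU between the two layers, the 0.5 decision threshold, and the choice of 8 output labels are all assumptions made for illustration, and the input vector merely stands in for whatever image embedding the transformer produces.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiLabelHead:
    """Hypothetical two-layer feed-forward head with per-label sigmoid outputs."""

    def __init__(self, embed_dim, hidden_dim, num_labels, seed=0):
        rng = random.Random(seed)
        self.w1, self.b1 = init_layer(embed_dim, hidden_dim, rng)
        self.w2, self.b2 = init_layer(hidden_dim, num_labels, rng)

    def forward(self, features):
        # Hidden layer. ReLU is an assumption; the abstract names only the sigmoid.
        hidden = [max(0.0, h) for h in linear(features, self.w1, self.b1)]
        logits = linear(hidden, self.w2, self.b2)
        # One independent sigmoid per label, so the outputs need not sum to 1.
        return [sigmoid(z) for z in logits]

    def predict(self, features, threshold=0.5):
        # Assumed decision rule: a label is "present" when its probability
        # reaches the threshold.
        return [int(p >= threshold) for p in self.forward(features)]


# Illustrative sizes only; none of these values come from the paper.
head = MultiLabelHead(embed_dim=16, hidden_dim=8, num_labels=8)
features = [0.5] * 16  # stand-in for the transformer's image embedding
probs = head.forward(features)
labels = head.predict(features)
```

The point of the per-label sigmoid, as opposed to a softmax, is that each disease label receives its own probability, so several labels can be active for the same image. That matches the abstract's setting of one patient presenting with multiple disorders.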

Pages: 11
