A Novel Transformer Model With Multiple Instance Learning for Diabetic Retinopathy Classification

Cited by: 10
Authors
Yang, Yaoming [1]
Cai, Zhili [1]
Qiu, Shuxia [1, 2]
Xu, Peng [1, 2]
Affiliations
[1] China Jiliang Univ, Coll Sci, Hangzhou 310018, Peoples R China
[2] Key Lab Intelligent Mfg Qual Big Data Tracing & An, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Vision Transformer; multiple instance learning; diabetic retinopathy; high-resolution fundus retinal images; medical image classification; DISEASE; IMAGES;
DOI
10.1109/ACCESS.2024.3351473
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Diabetic retinopathy (DR) is an irreversible fundus retinopathy. A deep learning-based automated DR diagnosis system can save diagnostic time. Although Transformers have shown superior performance compared to Convolutional Neural Networks (CNNs), they typically require pre-training on large amounts of data. Transformer-based DR diagnosis methods may alleviate the problem of limited performance on small-scale retinal datasets by loading pre-trained weights, but the input image size is then restricted to 224 x 224. The resolution of retinal images captured by fundus cameras is much higher than 224 x 224, so reducing the resolution for training loses valuable information. To use high-resolution retinal images efficiently, a new Transformer model with multiple instance learning (TMIL) is proposed for DR classification. A multiple instance learning approach is first applied to the retinal images, segmenting each high-resolution image into 224 x 224 patches. A Vision Transformer (ViT) then extracts features from each patch. Next, a Global Instance Computing Block (GICB) is designed to calculate inter-instance features. After global information from the GICB is introduced, the features are used to output the classification result. With high-resolution retinal images, TMIL can load pre-trained Transformer weights without model performance being affected by weight interpolation. Experimental results on the APTOS and Messidor-1 datasets demonstrate that TMIL achieves better classification performance and reduces inference time by 62% compared with feeding high-resolution images directly into ViT. TMIL also achieves higher classification accuracy than current state-of-the-art results.
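The patch-segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `image_to_bag`, the non-overlapping tiling, and the choice to crop the image to a multiple of 224 are all assumptions made for illustration; the abstract states only that high-resolution images are segmented into 224 x 224 patches.

```python
import numpy as np

PATCH = 224  # patch side length stated in the abstract


def image_to_bag(image: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split an H x W x C image into non-overlapping patch x patch tiles.

    Assumption: H and W are cropped down to the nearest multiple of
    `patch`; the paper may pad, resize, or overlap patches instead.
    Returns an array of shape (num_patches, patch, patch, C), i.e. one
    "bag" of instances in multiple instance learning terms.
    """
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than one patch")
    cropped = image[: rows * patch, : cols * patch]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    tiles = cropped.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)
    # Flatten the grid in row-major order: patch k is grid cell (k // cols, k % cols).
    return tiles.reshape(rows * cols, patch, patch, c)


# Hypothetical 896 x 896 RGB image standing in for a fundus photograph.
image = np.zeros((896, 896, 3), dtype=np.uint8)
bag = image_to_bag(image)
print(bag.shape)  # (16, 224, 224, 3)
```

Each patch in the returned bag already has the 224 x 224 input size the abstract associates with pre-trained weights, so a patch-level feature extractor could presumably be applied to it without resizing its position embeddings.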
Pages: 6768-6776
Page count: 9