A Novel Transformer Model With Multiple Instance Learning for Diabetic Retinopathy Classification

Cited by: 10

Authors:

Yang, Yaoming [1]

Cai, Zhili [1]

Qiu, Shuxia [1, 2]

Xu, Peng [1, 2]

Affiliations:

[1] China Jiliang Univ, Coll Sci, Hangzhou 310018, Peoples R China

[2] Key Lab Intelligent Mfg Qual Big Data Tracing & An, Hangzhou 310018, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Funding:

National Natural Science Foundation of China

Keywords:

Vision Transformer; multiple instance learning; diabetic retinopathy; high-resolution fundus retinal images; medical image classification; DISEASE; IMAGES;

DOI:

10.1109/ACCESS.2024.3351473

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Diabetic retinopathy (DR) is an irreversible fundus retinopathy. A deep learning-based automated DR diagnosis system can save diagnostic time. Although Transformers have shown superior performance compared to convolutional neural networks (CNNs), they typically require pre-training on large amounts of data. Transformer-based DR diagnosis methods may alleviate the problem of limited performance on small-scale retinal datasets by loading pre-trained weights, but doing so restricts the input image size to 224 x 224. Retinal images captured by fundus cameras have a much higher resolution than 224 x 224, so reducing the resolution for training loses valuable information. To use high-resolution retinal images efficiently, a new Transformer model with multiple instance learning (TMIL) is proposed for DR classification. A multiple instance learning approach is first applied to segment each high-resolution retinal image into 224 x 224 image patches. A Vision Transformer (ViT) then extracts features from each patch. Next, a Global Instance Computing Block (GICB) is designed to compute the inter-instance features. After the global information from the GICB is introduced, the features are used to output the classification result. With high-resolution retinal images, TMIL can load pre-trained Transformer weights without its performance being affected by weight interpolation. Experimental results on the APTOS and Messidor-1 datasets demonstrate that TMIL achieves better classification performance and reduces inference time by 62% compared with feeding high-resolution images directly into ViT. TMIL also achieves the highest classification accuracy among the current state-of-the-art results.
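
The patching step described in the abstract (segmenting a high-resolution image into 224 x 224 instances) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `image_to_bag`, the non-overlapping tiling, and the cropping of edge pixels that do not fill a whole patch are all assumptions made for illustration. Only the 224 x 224 patch size comes from the abstract.

```python
import numpy as np

PATCH = 224  # patch side length stated in the abstract


def image_to_bag(image: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping patch x patch instances.

    Assumption: H and W are cropped down to the nearest multiple of `patch`;
    the paper may handle borders differently (e.g. resizing or padding).
    Returns an array of shape (num_patches, patch, patch, C), row-major order.
    """
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than one patch")
    cropped = image[: rows * patch, : cols * patch]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    tiles = cropped.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch, patch, c)


if __name__ == "__main__":
    demo = np.zeros((896, 896, 3), dtype=np.uint8)  # hypothetical input size
    bag = image_to_bag(demo)
    print(bag.shape)  # (16, 224, 224, 3)
```

Each element of the returned bag would then be passed to a feature extractor such as a pre-trained ViT, which is why the patch size matches the 224 x 224 input the abstract mentions.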


Pages: 6768-6776

Number of pages: 9
