A Novel Transformer Model With Multiple Instance Learning for Diabetic Retinopathy Classification

Cited by: 4
Authors
Yang, Yaoming [1]
Cai, Zhili [1]
Qiu, Shuxia [1, 2]
Xu, Peng [1, 2]
Affiliations
[1] China Jiliang Univ, Coll Sci, Hangzhou 310018, Peoples R China
[2] Key Lab Intelligent Mfg Qual Big Data Tracing & An, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Vision Transformer; multiple instance learning; diabetic retinopathy; high-resolution fundus retinal images; medical image classification; DISEASE; IMAGES;
DOI
10.1109/ACCESS.2024.3351473
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Diabetic retinopathy (DR) is an irreversible fundus retinopathy. A deep learning-based automated DR diagnosis system can save diagnostic time. Although Transformers have shown superior performance compared with Convolutional Neural Networks (CNNs), they typically require pre-training on large amounts of data. Transformer-based DR diagnosis methods can alleviate limited performance on small-scale retinal datasets by loading pre-trained weights, but the input image size is then restricted to 224 x 224. Retinal images captured by fundus cameras have a much higher resolution than 224 x 224, so reducing the resolution for training loses valuable information. To use high-resolution retinal images efficiently, a new Transformer model with multiple instance learning (TMIL) is proposed for DR classification. A multiple instance learning approach is first applied to segment the high-resolution retinal images into 224 x 224 image patches. A Vision Transformer (ViT) then extracts features from each patch, and a Global Instance Computing Block (GICB) is designed to calculate the inter-instance features. After global information from the GICB is introduced, the features are used to output the classification results. With high-resolution retinal images, TMIL can load pre-trained Transformer weights without model performance being affected by weight interpolation. Experimental results on the APTOS and Messidor-1 datasets demonstrate that TMIL achieves better classification performance and reduces inference time by 62% compared with directly inputting high-resolution images into ViT. TMIL also shows the highest classification accuracy compared with current state-of-the-art results.
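The patch-based pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `instance_boxes` and `mean_pool`, the 896 x 896 example size, and the choice to drop incomplete border tiles are assumptions made here for illustration. `mean_pool` is a generic multiple instance learning aggregation used only as a placeholder; the abstract does not describe the internals of the GICB.

```python
from typing import List, Tuple

PATCH = 224  # instance size stated in the abstract


def instance_boxes(height: int, width: int,
                   patch: int = PATCH) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes of non-overlapping tiles.

    Assumption: border remainders smaller than a full tile are dropped;
    the abstract does not say how image borders are handled.
    """
    boxes = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            boxes.append((top, left, top + patch, left + patch))
    return boxes


def mean_pool(features: List[List[float]]) -> List[float]:
    """Average per-instance feature vectors into one bag-level vector.

    Generic MIL pooling placeholder, NOT the paper's GICB.
    """
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


# A hypothetical 896 x 896 fundus image yields a 4 x 4 grid of instances.
boxes = instance_boxes(896, 896)
print(len(boxes))  # 16
```

Each box would be cropped and passed to a pre-trained ViT at its native 224 x 224 input size, which is why no positional-embedding weight interpolation is needed in this setup.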
Pages: 6768-6776
Number of pages: 9