CNN-Trans model: A parallel dual-branch network for fundus image classification

Cited by: 2
Authors
Liu, Shuxian [1 ]
Wang, Wei [2 ]
Deng, Le [1 ]
Xu, Huan [1 ]
Affiliations
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi 830017, Peoples R China
[2] Xinjiang Teachers Coll, Sch Informat Sci & Technol, Urumqi 830043, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fundus image classification; Parallel dual branch; Attention mechanism; Feature fusion; CNN;
DOI
10.1016/j.bspc.2024.106621
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Classification Code
0831;
Abstract
Fundus diseases not only endanger people's vision but also impose a serious economic burden on society. Fundus images are an objective and standard basis for diagnosing fundus diseases. With the continuous advancement of computer science, deep learning methods dominated by convolutional neural networks (CNNs) have been widely used in fundus image classification. However, current CNN-based fundus image classification still has considerable room for improvement: a CNN cannot effectively avoid interference from repetitive background information, and its ability to model global context is limited. In response to these findings, this paper proposes the CNN-Trans model, a parallel dual-branch network consisting of a CNN-LSTM branch and a Vision Transformer (ViT) branch. The CNN-LSTM branch uses Xception, after transfer learning, as the initial feature extractor; an LSTM placed before the classification head addresses the vanishing-gradient problem in neural network iterations; and a new lightweight attention mechanism, Coordinate Attention, is introduced between Xception and the LSTM to emphasize key information relevant to classification and suppress less useful, repetitive background information. Meanwhile, the self-attention mechanism in the ViT branch is not limited to local interactions: it can establish long-range dependencies on the target and extract global features. Finally, a concatenation (Concat) operation fuses the features of the two branches. The local features extracted by the CNN-LSTM branch and the global features extracted by the ViT branch are complementary, so after feature fusion, more comprehensive image feature information is passed to the classification layer.
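The Coordinate Attention step described above factors attention into two direction-aware gates. The following is a minimal NumPy sketch of that mechanism (after Hou et al., 2021); the random 1x1-convolution weights and the reduction ratio are illustrative stand-ins, not the paper's learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, reduction=8, rng=None):
    # x: feature map of shape (B, C, H, W).
    # Random weights stand in for the learned 1x1 convolutions.
    rng = rng or np.random.default_rng(0)
    b, c, h, w = x.shape
    mid = max(c // reduction, 1)

    # Direction-aware pooling: average over width and over height.
    pool_h = x.mean(axis=3)            # (B, C, H)
    pool_w = x.mean(axis=2)            # (B, C, W)

    # Shared channel-mixing (1x1-conv analogue) over the concatenated axis.
    w1 = rng.standard_normal((mid, c)) * 0.1
    y = np.concatenate([pool_h, pool_w], axis=2)          # (B, C, H+W)
    y = np.maximum(np.einsum('mc,bcl->bml', w1, y), 0.0)  # ReLU, (B, mid, H+W)

    # Split back, project to C channels, squash to (0, 1) attention gates.
    y_h, y_w = y[:, :, :h], y[:, :, h:]
    w_h = rng.standard_normal((c, mid)) * 0.1
    w_w = rng.standard_normal((c, mid)) * 0.1
    a_h = sigmoid(np.einsum('cm,bml->bcl', w_h, y_h))[:, :, :, None]  # (B,C,H,1)
    a_w = sigmoid(np.einsum('cm,bml->bcl', w_w, y_w))[:, :, None, :]  # (B,C,1,W)

    # Reweight the input along both spatial directions.
    return x * a_h * a_w

feat = np.ones((2, 16, 8, 8))
print(coordinate_attention(feat).shape)  # (2, 16, 8, 8)
```

Because the two gates keep per-row and per-column structure, the mechanism can localize salient regions while suppressing repetitive background, which matches its role between Xception and the LSTM here.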
Finally, after extensive experimental testing and comparison, the results show that the CNN-Trans model achieves an accuracy of 80.68% on the fundus image classification task, with classification performance comparable to state-of-the-art methods.
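The parallel dual-branch fusion can be sketched as follows. Random feature generators stand in for the trained CNN-LSTM and ViT branches, and the feature dimensions (2048 for the CNN branch, 768 for ViT-style embeddings) and the linear classification head are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_lstm_branch(images):
    # Stand-in for the Xception + Coordinate Attention + LSTM branch:
    # maps a batch of images to local feature vectors (here, 2048-d).
    return rng.standard_normal((images.shape[0], 2048))

def vit_branch(images):
    # Stand-in for the ViT branch: maps the same batch to global
    # feature vectors (768-d, a typical ViT embedding size).
    return rng.standard_normal((images.shape[0], 768))

def cnn_trans_forward(images, n_classes=8):
    # Run both branches in parallel, fuse by concatenation (Concat),
    # then apply an illustrative linear classification head.
    local_feats = cnn_lstm_branch(images)     # (B, 2048) local features
    global_feats = vit_branch(images)         # (B, 768)  global features
    fused = np.concatenate([local_feats, global_feats], axis=1)  # (B, 2816)
    w = rng.standard_normal((fused.shape[1], n_classes)) * 0.01
    return fused @ w                          # (B, n_classes) logits

batch = np.zeros((4, 224, 224, 3))  # four dummy fundus images
print(cnn_trans_forward(batch).shape)  # (4, 8)
```

Concatenation keeps the local and global feature spaces intact side by side, leaving it to the classification layer to learn how to weight the two branches.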
Pages: 15
Cited References
Total: 37
  • [1] Ali M.A., 2024, Intelligent Systems with Applications
  • [2] [Anonymous], 2018, Signal processing and machine learning for biomedical big data
  • [3] Balaji J.J., 2020, Ophthalmic Technologies XXX, V11218, P86
  • [4] Diabetic Retinopathy Detection from Fundus Images of the Eye Using Hybrid Deep Learning Features
    Butt, Muhammad Mohsin
    Iskandar, D. N. F. Awang
    Abdelhamid, Sherif E.
    Latif, Ghazanfar
    Alghazo, Runna
    [J]. DIAGNOSTICS, 2022, 12 (07)
  • [5] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
  • [6] Carion N, 2020, Arxiv, DOI [arXiv:2005.12872, 10.48550/arxiv.2005.12872]
  • [7] Classification of Fundus Images Based on Deep Learning for Detecting Eye Diseases
    Chea, Nakhim
    Nam, Yunyoung
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 67 (01): : 411 - 426
  • [8] Chen M, 2020, PR MACH LEARN RES, V119
  • [9] Cheng J., 2016, EMNLP, P551, DOI [DOI 10.18653/V1/D16-1053, 10.18653/v1/D16-1053]
  • [10] Multi-categorical deep learning neural network to classify retinal images: A pilot study employing small database
    Choi, Joon Yul
    Yoo, Tae Keun
    Seo, Jeong Gi
    Kwak, Jiyong
    Um, Terry Taewoong
    Rim, Tyler Hyungtaek
    [J]. PLOS ONE, 2017, 12 (11):