Medical Report Generation from Medical Images Using Vision Transformer and Bart Deep Learning Architectures

Cited by: 0
Authors
Ucan, Murat [1]
Kaya, Buket [2]
Kaya, Mehmet [3]
Alhajj, Reda [4]
Affiliations
[1] Dicle Univ, Dept Comp Technol, Diyarbakir, Turkiye
[2] Firat Univ, Dept Elect & Automat, Elazig, Turkiye
[3] Firat Univ, Dept Comp Engn, Elazig, Turkiye
[4] Univ Calgary, Dept Comp Sci, Calgary, AB, Canada
Source
SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2024, PT IV | 2025 / Vol. 15214
Keywords
Deep Learning; Vision Transformer; ViT; Bidirectional Autoregressive Transformer; BART; Medical Report Generation; Chest X-rays;
DOI
10.1007/978-3-031-78554-2_17
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generating medical reports from medical images by traditional methods is time-consuming, prone to human error, and dependent on experience. Delays in producing reports postpone patient treatment, and misdiagnosis can lead to adverse outcomes, including patient death. The main objective of this study is to develop a high-performance deep learning model that can autonomously generate medical reports from medical images. The proposed model consists of a Vision Transformer (ViT) encoder and a Bidirectional Autoregressive Transformer (BART) decoder. The model was trained and tested on images and reports from the Indiana University Chest X-Ray dataset. It was evaluated with quantitative metrics and then compared with competing models in the literature that use the same dataset. The proposed Vi-Ba architecture achieved scores of 0.150, 0.154, and 0.274 on the BLEU-4, METEOR, and ROUGE word-matching evaluation metrics, respectively. The Vi-Ba model achieved high reporting performance compared with the studies reviewed in the literature. The results show that the proposed architecture can be used by specialist doctors in hospitals to diagnose diseases faster and more accurately. In this way, misdiagnoses and incorrect treatments will be reduced and human life will be protected.
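To make the word-matching idea behind the reported BLEU-4 score concrete, the following is a minimal Python sketch of sentence-level BLEU-4 with a single reference and no smoothing. The abstract does not state which BLEU implementation or settings the authors used, so this is an illustrative assumption rather than their evaluation code. The function names (`ngrams`, `clipped_precision`, `bleu4`) are hypothetical, and the example sentences are invented, not taken from the Indiana University Chest X-Ray dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Modified n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against one reference
    (a simplification assumed for this sketch)."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, 5)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives zero
    log_mean = sum(math.log(p) for p in precisions) / 4
    c, r = len(candidate), len(reference)
    # Brevity penalty discourages candidates shorter than the reference.
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(log_mean)


# Invented example sentences, for illustration only.
reference = "the heart size is normal and the lungs are clear".split()
candidate = "the heart size is normal and lungs are clear".split()
score = bleu4(candidate, reference)
print(f"BLEU-4: {score:.3f}")
```

Corpus-level BLEU, as usually reported in papers, pools n-gram counts over all reports before taking the geometric mean, so per-sentence values from a sketch like this are not directly comparable to the 0.150 reported above.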
Pages: 257-267
Page count: 11
Related papers
20 in total
  • [1] Alfarghaly Omar, 2021, Informatics in Medicine Unlocked, V24, DOI 10.1016/j.imu.2021.100557
  • [2] Banerjee S, 2005, P ACL WORKSH INTR EX, P65, DOI 10.3115/1626355.1626389
  • [3] Automatic detection of cancer metastasis in lymph node using deep learning
    Butun, Ertan
    Ucan, Murat
    Kaya, Mehmet
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 82
  • [4] A hybrid DenseNet121-UNet model for brain tumor segmentation from MR Images
    Cinar, Necip
    Ozcan, Alper
    Kaya, Mehmet
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 76
  • [5] Preparing a collection of radiology examinations for distribution and retrieval
    Demner-Fushman, Dina
    Kohli, Marc D.
    Rosenman, Marc B.
    Shooshan, Sonya E.
    Rodriguez, Laritza
    Antani, Sameer
    Thoma, George R.
    McDonald, Clement J.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2016, 23 (02) : 304 - 310
  • [6] Dosovitskiy A, 2021, Arxiv, DOI arXiv:2010.11929
  • [7] A Survey on Vision Transformer
    Han, Kai
    Wang, Yunhe
    Chen, Hanting
    Chen, Xinghao
    Guo, Jianyuan
    Liu, Zhenhua
    Tang, Yehui
    Xiao, An
    Xu, Chunjing
    Xu, Yixing
    Yang, Zhaohui
    Zhang, Yiman
    Tao, Dacheng
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 87 - 110
  • [8] Harzig P, 2019, Arxiv, DOI arXiv:1908.02123
  • [9] Toward deep MRI segmentation for Alzheimer's disease detection
    Helaly, Hadeer A.
    Badawy, Mahmoud
    Haikal, Amira Y.
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (02) : 1047 - 1063
  • [10] Lewis M, 2019, Arxiv, DOI arXiv:1910.13461