Comparison of Multi-Modal Large Language Models with Deep Learning Models for Medical Image Classification

Times cited: 0
Authors
Than, Joel Chia Ming [1]
Vong, Wan Tze [1]
Yong, Kelvin Sheng Chek [1]
Affiliations
[1] Swinburne Univ Technol, Sch Informat Comp & Technol, Sarawak Campus, Kuching, Malaysia
Source
2024 IEEE 8TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING APPLICATIONS, ICSIPA | 2024
Keywords
LLM; multi-modal; deep learning; classification; image
DOI
10.1109/ICSIPA62061.2024.10687159
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, advances in large language models (LLMs) such as GPT-4 and Gemini have opened new avenues for artificial intelligence applications in various domains, including medical image classification. This study compares the performance of multi-modal LLMs with state-of-the-art deep learning networks in classifying tumour and non-tumour images. The performance of four multi-modal LLMs and four conventional deep learning methods was evaluated using several performance measures. The results demonstrate the strengths and limitations of both approaches, providing insights into their applicability and potential integration in clinical practice. Gemini 1.5 Pro performed best of the eight models evaluated. This comparison underscores the evolving role of AI in enhancing diagnostic accuracy and supporting medical professionals in disease detection, especially when training data is scarce.
Pages: 5