Comparison of Multi-Modal Large Language Models with Deep Learning Models for Medical Image Classification

Cited by: 1

Authors:

Than, Joel Chia Ming [1]

Vong, Wan Tze [1]

Yong, Kelvin Sheng Chek [1]

Affiliations:

[1] Swinburne Univ Technol, Sch Informat Comp & Technol, Sarawak Campus, Kuching, Malaysia

Source:

2024 IEEE 8TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING APPLICATIONS, ICSIPA | 2024

Keywords:

LLM; multi-modal; deep learning; classification; image

DOI:

10.1109/ICSIPA62061.2024.10687159

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In recent years, the advancement of large language models (LLMs) such as GPT-4 and Gemini has opened new avenues for artificial intelligence applications in various domains, including medical image classification. This study compares the performance of multi-modal LLMs with state-of-the-art deep learning networks in classifying tumour and non-tumour images. The performance of four multi-modal LLMs and four conventional deep learning methods was evaluated using several performance measures. The results demonstrate the strengths and limitations of both approaches, providing insights into their applicability and potential integration in clinical practice. Gemini 1.5 Pro performed best among the eight models evaluated. This comparison underscores the evolving role of AI in enhancing diagnostic accuracy and supporting medical professionals in disease detection, especially when training data is scarce.


Pages: 5
