A Multimodal Transfer Learning Approach Using PubMedCLIP for Medical Image Classification

Cited by: 3
Authors
Dao, Hong N. [1]
Nguyen, Tuyen [1, 2]
Mugisha, Cherubin [1]
Paik, Incheon [1]
Affiliations
[1] Univ Aizu, Dept Comp & Informat Syst, Aizu Wakamatsu, Fukushima 9658580, Japan
[2] Univ Technol Sydney, Sch Comp Sci, Ultimo, NSW 2007, Australia
Keywords
Biomedical imaging; Feature extraction; Image classification; Task analysis; Training; Transfer learning; Multimodal sensors; Classification algorithms; Pre-trained model; medical image; classification task; contrastive language-image pre-training; feature fusion; multimodal model; prompt engineering; convolutional neural networks
DOI
10.1109/ACCESS.2024.3401777
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Medical image data often face the problem of data scarcity and costly annotation processes. To overcome this, our study introduces a novel transfer learning method for medical image classification. We present a multimodal learning framework that incorporates the pre-trained PubMedCLIP model and multimodal feature fusion. Prompts of different complexities are combined with images as inputs to the proposed model. Our findings demonstrate that this approach significantly enhances image classification tasks while reducing the burden of annotation costs. Our study underscores the potential of PubMedCLIP in revolutionizing medical image analysis through its prompt-based approach and showcases the value of multi-modality for training robust models in healthcare. Code is available at: https://github.com/HongJapan/MTL_prompt_medical.git.
Pages: 75496-75507
Number of pages: 12