Aligning Medical Images with General Knowledge from Large Language Models

Cited by: 0

Authors:

Fang, Xiao [1]

Lin, Yi [1]

Zhang, Dong [2]

Cheng, Kwang-Ting [2]

Chen, Hao [1,3,4]

Affiliations:

[1] HKUST, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[2] HKUST, Dept Elect & Comp Engn, Hong Kong, Peoples R China

[3] HKUST, Dept Chem & Biol Engn, Hong Kong, Peoples R China

[4] HKUST Shenzhen Hong Kong Collaborat Innovat Res I, Shenzhen, Peoples R China

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT X | 2024 / Vol. 15010

Keywords:

Prompt Learning; Vision-Language Models; Large Language Model; Medical Image Analysis;

DOI:

10.1007/978-3-031-72117-5_6

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pre-trained large vision-language models (VLMs) like CLIP have revolutionized visual representation learning by using natural language as supervision, and have demonstrated promising generalization ability. In this work, we propose ViP, a novel visual symptom-guided prompt learning framework for medical image analysis, which facilitates general knowledge transfer from CLIP. ViP consists of two key components: a visual symptom generator (VSG) and a dual-prompt network. Specifically, VSG aims to extract explicable visual symptoms from pre-trained large language models, while the dual-prompt network uses these visual symptoms to guide the training of two learnable prompt modules, i.e., a context prompt and a merge prompt, which effectively adapt our framework to medical image analysis via large VLMs. Extensive experimental results demonstrate that ViP can outperform state-of-the-art methods on two challenging datasets. The code is available at https://github.com/xiaofang007/ViP.


Pages: 57-67

Page count: 11
