In-context learning enables multimodal large language models to classify cancer pathology images

Cited by: 4
Authors
Ferber, Dyke [1 ,2 ,3 ]
Woelflein, Georg [4 ]
Wiest, Isabella C. [3 ,5 ]
Ligero, Marta [3 ]
Sainath, Srividhya [3 ]
Ghaffari Laleh, Narmin [3 ]
El Nahhas, Omar S. M. [3 ]
Mueller-Franzes, Gustav [6 ]
Jaeger, Dirk [1 ,2 ]
Truhn, Daniel [6 ]
Kather, Jakob Nikolas [1 ,2 ,3 ,7 ]
Affiliations
[1] Heidelberg Univ Hosp, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[2] Heidelberg Univ Hosp, Dept Med Oncol, Heidelberg, Germany
[3] Tech Univ Dresden, Else Kroener Fresenius Ctr Digital Hlth, Dresden, Germany
[4] Univ St Andrews, Sch Comp Sci, St Andrews, Scotland
[5] Heidelberg Univ, Med Fac Mannheim, Dept Med 2, Mannheim, Germany
[6] Univ Hosp Aachen, Dept Diagnost & Intervent Radiol, Aachen, Germany
[7] Univ Hosp Dresden, Dept Med 1, Dresden, Germany
DOI
10.1038/s41467-024-51465-9
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from examples within prompts, bypassing the need for parameter updates. Yet in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) with in-context learning on three cancer histopathology tasks of high importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data are scarce.

Medical image classification remains a challenging process in deep learning. Here, the authors evaluate a large vision language foundation model (GPT-4V) with in-context learning for cancer image processing and show that such models can learn from examples and reach performance similar to specialized neural networks, while reducing the gap to current state-of-the-art pathology foundation models.
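To make the in-context learning setup concrete, the sketch below shows how few-shot image prompting of a GPT-4-class vision model might look via the OpenAI chat completions API: labeled example tiles are placed directly in the prompt, and the model classifies a new tile without any parameter updates. This is a minimal illustration under stated assumptions, not the paper's exact pipeline; the model name, tile paths, and label set are hypothetical placeholders, and the authors' actual prompt design and sampling strategy are described in the paper.

```python
import base64
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def to_data_url(path: str) -> str:
    """Encode a local image as a base64 data URL for the vision API."""
    b64 = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:image/png;base64,{b64}"


# Hypothetical few-shot examples: (tile path, ground-truth label) pairs.
# In-context learning means these labeled examples live in the prompt;
# no weights are updated anywhere.
examples = [
    ("tiles/tumor_1.png", "tumor"),
    ("tiles/normal_1.png", "normal tissue"),
]
query_tile = "tiles/unknown.png"

# Build one multimodal user message: instruction, then alternating
# example images and their labels, then the tile to classify.
content = [{
    "type": "text",
    "text": ("Classify each histopathology tile as 'tumor' or "
             "'normal tissue'. Here are labeled examples:"),
}]
for path, label in examples:
    content.append({"type": "image_url",
                    "image_url": {"url": to_data_url(path)}})
    content.append({"type": "text", "text": f"Label: {label}"})
content.append({"type": "text", "text": "Now classify this tile:"})
content.append({"type": "image_url",
                "image_url": {"url": to_data_url(query_tile)}})

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder for any GPT-4-class vision model
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```

Adding or removing (image, label) pairs in `examples` changes the number of in-context shots, which is the knob the study varies when comparing against task-specific networks.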
Pages: 12