In-context learning enables multimodal large language models to classify cancer pathology images

Cited by: 4
Authors
Ferber, Dyke [1 ,2 ,3 ]
Woelflein, Georg [4 ]
Wiest, Isabella C. [3 ,5 ]
Ligero, Marta [3 ]
Sainath, Srividhya [3 ]
Ghaffari Laleh, Narmin [3 ]
El Nahhas, Omar S. M. [3 ]
Mueller-Franzes, Gustav [6 ]
Jaeger, Dirk [1 ,2 ]
Truhn, Daniel [6 ]
Kather, Jakob Nikolas [1 ,2 ,3 ,7 ]
Affiliations
[1] Heidelberg Univ Hosp, Natl Ctr Tumor Dis NCT, Heidelberg, Germany
[2] Heidelberg Univ Hosp, Dept Med Oncol, Heidelberg, Germany
[3] Tech Univ Dresden, Else Kroener Fresenius Ctr Digital Hlth, Dresden, Germany
[4] Univ St Andrews, Sch Comp Sci, St Andrews, Scotland
[5] Heidelberg Univ, Med Fac Mannheim, Dept Med 2, Mannheim, Germany
[6] Univ Hosp Aachen, Dept Diagnost & Intervent Radiol, Aachen, Germany
[7] Univ Hosp Dresden, Dept Med 1, Dresden, Germany
DOI
10.1038/s41467-024-51465-9
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline Codes
07; 0710; 09
Abstract
Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. Yet, in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate the model Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) on cancer image processing with in-context learning on three cancer histopathology tasks of high importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data are scarce.

Editor's summary: Medical image classification remains a challenging process in deep learning. Here, the authors evaluate a large vision language foundation model (GPT-4V) with in-context learning for cancer image processing and show that such models can learn from examples and reach performance similar to specialized neural networks, while reducing the gap to current state-of-the-art pathology foundation models.
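To make the in-context learning workflow concrete, the sketch below shows few-shot image classification via prompting alone, with no parameter updates. It is a minimal illustration assuming the OpenAI Python SDK and a GPT-4V-class multimodal model; the model name, file paths, class labels, and prompt wording are illustrative assumptions, not the authors' exact protocol.

```python
# Minimal sketch of few-shot in-context image classification with a
# vision-language model. Assumes the OpenAI Python SDK; the model name,
# file paths, and labels are illustrative placeholders, not the
# authors' exact prompt or dataset.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def to_data_url(path: str) -> str:
    """Encode a local image as a base64 data URL for the chat API."""
    with open(path, "rb") as f:
        return "data:image/png;base64," + base64.b64encode(f.read()).decode()

# One labeled example per class serves as the in-context "training set";
# no gradient updates or fine-tuning are involved.
few_shot = [("tumor_example.png", "tumor"), ("normal_example.png", "normal")]
query_image = "unknown_tile.png"

content = [{"type": "text",
            "text": "Classify each histopathology tile as 'tumor' or 'normal'. "
                    "Labeled examples follow, then the query image."}]
for path, label in few_shot:
    content.append({"type": "image_url", "image_url": {"url": to_data_url(path)}})
    content.append({"type": "text", "text": f"Label: {label}"})
content.append({"type": "image_url", "image_url": {"url": to_data_url(query_image)}})
content.append({"type": "text", "text": "Label:"})

response = client.chat.completions.create(
    model="gpt-4o",  # any GPT-4V-class multimodal model
    messages=[{"role": "user", "content": content}],
    max_tokens=5,
)
print(response.choices[0].message.content)  # e.g. "tumor"
```

Interleaving each example image with its text label, as above, is one common prompt layout; adding more labeled examples per class is how the number of in-context "shots" would be scaled.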
Pages: 12