CoD-MIL: Chain-of-Diagnosis Prompting Multiple Instance Learning for Whole Slide Image Classification

Citations: 0

Authors:

Shi, Jiangbo ^{[1]}

Li, Chen ^{[1]}

Gong, Tieliang

Wang, Chunbao ^{[2]}

Fu, Huazhu ^{[3]}

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian 710049, Shaanxi, Peoples R China

[2] Xi An Jiao Tong Univ, Affiliated Hosp 1, Dept Pathol, Xian 710061, Shaanxi, Peoples R China

[3] ASTAR, Inst High Performance Comp IHPC, Singapore 138632, Singapore

Source:

IEEE TRANSACTIONS ON MEDICAL IMAGING | 2025 / Vol. 44 / No. 03

Funding:

National Research Foundation, Singapore;

Keywords:

Pathology; Tumors; Feature extraction; Visualization; Image classification; Training; Electronic mail; Cognition; Cancer; Hospitals; Histopathology; whole slide image analysis; multiple instance learning; vision language model; TRANSFORMER;

DOI:

10.1109/TMI.2024.3485120

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Multiple instance learning (MIL) has emerged as a prominent paradigm in digital pathology for processing whole slide images (WSIs), which have a pyramid structure and gigapixel size. However, existing attention-based MIL methods are primarily trained on the image modality and a pre-defined label set, leading to limited generalization and interpretability. Recently, vision language models (VLMs) have achieved promising performance and transferability, offering potential solutions to the limitations of MIL-based methods. Pathological diagnosis is an intricate process that requires pathologists to examine the WSI step by step. In the field of natural language processing, the chain-of-thought (CoT) prompting method is widely utilized to imitate the human reasoning process. Inspired by the CoT prompt and pathologists' clinical knowledge, we propose a chain-of-diagnosis prompting multiple instance learning (CoD-MIL) framework for whole slide image classification. Specifically, the chain-of-diagnosis text prompt decomposes the complex diagnostic process in WSI into progressive sub-processes from low to high magnification. Additionally, we propose a text-guided contrastive masking module to accurately localize the tumor region by masking the most discriminative instances and introducing the guidance of normal tissue texts in a contrastive way. Extensive experiments conducted on three real-world subtyping datasets demonstrate the effectiveness and superiority of CoD-MIL.
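
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of generic attention-based MIL pooling and of masking the highest-scoring instances. It is not the authors' CoD-MIL implementation: the function names (`softmax`, `attention_pool`, `mask_top_k`), the toy features, and the fixed attention scores are all assumptions made for illustration, and the text guidance from the vision language model is omitted entirely.

```python
import math

def softmax(scores):
    """Numerically stable softmax; a score of -inf receives weight 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    """Aggregate instance features into one bag feature via attention weights."""
    weights = softmax(scores)
    dim = len(features[0])
    bag = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return bag, weights

def mask_top_k(scores, k):
    """Return indices of the k highest scores and a copy with them set to -inf.

    Assumes k < len(scores), so at least one instance stays unmasked.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:k]
    masked = [float("-inf") if i in top else s for i, s in enumerate(scores)]
    return top, masked

# Toy bag: three patch features with hypothetical attention scores.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.5, 1.0]

bag, weights = attention_pool(features, scores)

# Mask the single most discriminative instance and pool again, so the
# remaining instances must carry the bag-level representation.
top, masked_scores = mask_top_k(scores, k=1)
masked_bag, masked_weights = attention_pool(features, masked_scores)
```

In this sketch, masking forces the attention mass onto the remaining instances, which is the general intuition behind masking the most discriminative instances; how CoD-MIL combines this with normal tissue text prompts is described only in the paper itself.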


Pages: 1218-1229

Page count: 12
