DT-MIL: Deformable Transformer for Multi-instance Learning on Histopathological Image

Cited by: 74

Authors:

Li, Hang [1,2]

Yang, Fan [2]

Zhao, Yu [2]

Xing, Xiaohan [2,3]

Zhang, Jun [2]

Gao, Mingxuan [1,2]

Huang, Junzhou [2]

Wang, Liansheng [1]

Yao, Jianhua [2]

Affiliations:

[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China

[2] Tencent, AI Lab, Shenzhen, Peoples R China

[3] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT VIII | 2021 / Vol. 12908

Funding:

National Key Research and Development Program of China;

Keywords:

Deformable transformer; Multi-instance learning; Key-value attention; Histopathological image analysis; CANCER;

DOI:

10.1007/978-3-030-87237-3_20

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Learning informative representations is crucial for classification and prediction tasks on histopathological images. Due to the huge image size, whole-slide histopathological image analysis is normally addressed with a multi-instance learning (MIL) scheme. However, the weakly supervised nature of MIL makes it challenging to learn an effective whole-slide-level representation. To tackle this issue, we present a novel embedded-space MIL model based on a deformable transformer (DT) architecture and convolutional layers, which is termed DT-MIL. The DT architecture enables our MIL model to update each instance feature by globally aggregating the instance features in a bag simultaneously and to encode the position context information of instances during bag representation learning. Compared with other state-of-the-art MIL models, our model has the following advantages: (1) generating the bag representation in a fully trainable way, (2) representing the bag with a high-level, nonlinear combination of all instances instead of fixed pooling-based methods (e.g., max pooling and average pooling) or simple attention-based linear aggregation, and (3) encoding the position relationship and context information during the bag embedding phase. Besides our proposed DT-MIL, we also develop other possible transformer-based MILs for comparison. Extensive experiments show that our DT-MIL outperforms the state-of-the-art methods and other transformer-based MIL architectures in histopathological image classification and prediction tasks. An open-source implementation of our approach can be found at https://github.com/yfzon/DT-MIL.
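The bag-aggregation strategies contrasted in the abstract can be illustrated with a minimal NumPy sketch. This is not the DT-MIL implementation: all function names, weight matrices, and shapes below are hypothetical, the last function uses ordinary dense self-attention as a stand-in for the deformable attention of the paper, and position encoding is omitted entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def max_pool_bag(instances):
    # Fixed pooling: element-wise maximum over the N instances of a bag.
    return instances.max(axis=0)

def mean_pool_bag(instances):
    # Fixed pooling: element-wise average over the N instances of a bag.
    return instances.mean(axis=0)

def attention_pool_bag(instances, V, w):
    # Attention-based linear aggregation: the bag vector is a weighted sum
    # of the instance features, with one learned scalar weight per instance.
    # V (D x H) and w (H,) are hypothetical learned parameters.
    scores = np.tanh(instances @ V) @ w          # shape (N,)
    weights = softmax(scores)                    # shape (N,), sums to 1
    return weights @ instances, weights

def self_attention_update(instances, Wq, Wk, Wv):
    # Assumption: plain dense self-attention, NOT deformable attention.
    # Each instance feature is updated from all instances in the bag.
    q, k, v = instances @ Wq, instances @ Wk, instances @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=1)   # shape (N, N)
    return attn @ v

# Toy bag: N = 6 instances (patches), each with D = 8 features.
rng = np.random.default_rng(0)
N, D, H = 6, 8, 4
bag = rng.normal(size=(N, D))

bag_max = max_pool_bag(bag)
bag_mean = mean_pool_bag(bag)
bag_attn, attn_weights = attention_pool_bag(
    bag, rng.normal(size=(D, H)), rng.normal(size=H)
)
updated = self_attention_update(
    bag, rng.normal(size=(D, D)), rng.normal(size=(D, D)), rng.normal(size=(D, D))
)
# After the update, every instance carries bag-level context, so a later
# pooling step operates on context-aware features.
bag_transformer = updated.mean(axis=0)
```

In this sketch the first three functions combine instance features without letting the instances interact, whereas the self-attention update mixes information across instances before any pooling, which is the distinction the abstract draws in advantage (2).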


Pages: 206-216

Number of pages: 11
