DT-MIL: Deformable Transformer for Multi-instance Learning on Histopathological Image

Cited by: 75

Authors:

Li, Hang [1,2]

Yang, Fan [2]

Zhao, Yu [2]

Xing, Xiaohan [2,3]

Zhang, Jun [2]

Gao, Mingxuan [1,2]

Huang, Junzhou [2]

Wang, Liansheng [1]

Yao, Jianhua [2]

Affiliations:

[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China

[2] Tencent, AI Lab, Shenzhen, Peoples R China

[3] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT VIII | 2021 / Vol. 12908

Funding:

National Key Research and Development Program of China;

Keywords:

Deformable transformer; Multi-instance learning; Key-value attention; Histopathological image analysis; CANCER;

DOI:

10.1007/978-3-030-87237-3_20

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Learning informative representations is crucial for classification and prediction tasks on histopathological images. Because of the huge image size, whole-slide histopathological image analysis is normally addressed with a multi-instance learning (MIL) scheme. However, the weakly supervised nature of MIL makes it challenging to learn an effective whole-slide-level representation. To tackle this issue, we present a novel embedded-space MIL model based on a deformable transformer (DT) architecture and convolutional layers, which is termed DT-MIL. The DT architecture enables our MIL model to update each instance feature by globally aggregating the instance features in a bag simultaneously, and to encode the position context information of instances during bag representation learning. Compared with other state-of-the-art MIL models, our model has the following advantages: (1) it generates the bag representation in a fully trainable way, (2) it represents the bag with a high-level, nonlinear combination of all instances instead of fixed pooling-based methods (e.g. max pooling and average pooling) or simple attention-based linear aggregation, and (3) it encodes the position relationship and context information during the bag embedding phase. Besides our proposed DT-MIL, we also develop other possible transformer-based MILs for comparison. Extensive experiments show that our DT-MIL outperforms the state-of-the-art methods and other transformer-based MIL architectures in histopathological image classification and prediction tasks. An open-source implementation of our approach can be found at https://github.com/yfzon/DT-MIL.
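For orientation, the baseline bag-aggregation schemes that the abstract contrasts with DT-MIL (max pooling, average pooling, and attention-based linear aggregation) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the DT-MIL model or the authors' code: the function names, the toy bag, and the single scoring vector `w` (standing in for a learned attention scorer) are all assumptions made for the example.

```python
import math

def max_pool(bag):
    """Fixed pooling: element-wise maximum over all instance features."""
    return [max(col) for col in zip(*bag)]

def mean_pool(bag):
    """Fixed pooling: element-wise average over all instance features."""
    return [sum(col) / len(bag) for col in zip(*bag)]

def attention_pool(bag, w):
    """Attention-based linear aggregation: softmax-weighted sum of instances.

    `w` is a hypothetical scoring vector; in a trained model the scorer
    would be learned, and is usually a small network rather than one vector.
    """
    # One scalar score per instance (dot product with w).
    scores = [sum(wi * xi for wi, xi in zip(w, inst)) for inst in bag]
    # Numerically stable softmax over the instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The bag representation is a linear combination of the instances.
    dim = len(bag[0])
    pooled = [sum(a * inst[d] for a, inst in zip(weights, bag)) for d in range(dim)]
    return pooled, weights

# Toy bag: three instances (e.g. patch features), two dimensions each.
bag = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
w = [1.0, 0.0]  # assumed scorer that only looks at the first feature

bag_max = max_pool(bag)
bag_mean = mean_pool(bag)
bag_att, att_weights = attention_pool(bag, w)
```

In all three cases the bag representation is a fixed or linear function of the instance features, and the instances do not exchange information with one another or carry position information. According to the abstract, that is the gap the deformable transformer encoder is meant to fill.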


Pages: 206-216

Number of pages: 11

