DT-MIL: Deformable Transformer for Multi-instance Learning on Histopathological Image

Cited by: 75
Authors
Li, Hang [1, 2]
Yang, Fan [2]
Zhao, Yu [2]
Xing, Xiaohan [2, 3]
Zhang, Jun [2]
Gao, Mingxuan [1, 2]
Huang, Junzhou [2]
Wang, Liansheng [1]
Yao, Jianhua [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
[2] Tencent, AI Lab, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT VIII | 2021 / Vol. 12908
Funding
National Key R&D Program of China
Keywords
Deformable transformer; Multi-instance learning; Key-value attention; Histopathological image analysis; CANCER;
DOI
10.1007/978-3-030-87237-3_20
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Learning informative representations is crucial for classification and prediction tasks on histopathological images. Due to the huge image size, whole-slide histopathological image analysis is normally addressed with a multi-instance learning (MIL) scheme. However, the weakly supervised nature of MIL makes it challenging to learn an effective whole-slide-level representation. To tackle this issue, we present a novel embedded-space MIL model based on a deformable transformer (DT) architecture and convolutional layers, which is termed DT-MIL. The DT architecture enables our MIL model to update each instance feature by globally aggregating the instance features in a bag simultaneously and by encoding the position context information of instances during bag representation learning. Compared with other state-of-the-art MIL models, our model has the following advantages: (1) generating the bag representation in a fully trainable way, (2) representing the bag with a high-level and nonlinear combination of all instances instead of fixed pooling-based methods (e.g., max pooling and average pooling) or simple attention-based linear aggregation, and (3) encoding the position relationship and context information during the bag embedding phase. Besides our proposed DT-MIL, we also develop other possible transformer-based MILs for comparison. Extensive experiments show that our DT-MIL outperforms the state-of-the-art methods and other transformer-based MIL architectures in histopathological image classification and prediction tasks. An open-source implementation of our approach can be found at https://github.com/yfzon/DT-MIL.
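The aggregation strategies contrasted in the abstract can be illustrated with a small NumPy sketch. This is not the DT-MIL implementation: the function names, weight matrices, and dimensions below are hypothetical, and the last function uses plain dense self-attention as a stand-in for the deformable attention of the paper. It only shows how fixed pooling, attention-based linear aggregation, and a transformer-style instance update differ in form.

```python
import numpy as np

def max_pool(bag):
    """Fixed pooling: element-wise maximum over the instances of a bag."""
    return bag.max(axis=0)

def mean_pool(bag):
    """Fixed pooling: element-wise average over the instances of a bag."""
    return bag.mean(axis=0)

def attention_pool(bag, V, w):
    """Attention-based linear aggregation: a learned weighted sum of instances.

    V and w are hypothetical trainable parameters (random in this sketch).
    """
    scores = np.tanh(bag @ V) @ w            # one scalar score per instance
    scores = scores - scores.max()           # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ bag, weights

def self_attention_update(bag, Wq, Wk, Wv):
    """Transformer-style update: each instance aggregates all instances.

    Dense self-attention is used here for illustration only; it is NOT the
    deformable attention described in the paper.
    """
    q, k, v = bag @ Wq, bag @ Wk, bag @ Wv
    logits = q @ k.T / np.sqrt(k.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ v

# Toy bag: 6 instances (patches), each with an 8-dimensional feature vector.
rng = np.random.default_rng(0)
n_instances, dim, hidden = 6, 8, 4
bag = rng.normal(size=(n_instances, dim))

z_max = max_pool(bag)
z_mean = mean_pool(bag)
z_attn, weights = attention_pool(
    bag, rng.normal(size=(dim, hidden)), rng.normal(size=hidden)
)
updated = self_attention_update(
    bag,
    rng.normal(size=(dim, dim)),
    rng.normal(size=(dim, dim)),
    rng.normal(size=(dim, dim)),
)

print(z_max.shape, z_mean.shape, z_attn.shape, updated.shape)
```

The two pooling functions and the attention pooling each return a single bag vector that is a fixed or linear combination of the instance features, whereas the self-attention update returns one refined feature per instance, which a later layer can combine nonlinearly into the bag representation.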
Pages: 206-216
Number of pages: 11