MAL: Multi-modal Attention Learning for Tumor Diagnosis Based on Bipartite Graph and Multiple Branches

Cited by: 3

Authors:

Jiao, Menglei [1,6]

Liu, Hong [1]

Liu, Jianfang [2]

Ouyang, Hanqiang [3,4,5]

Wang, Xiangdong [1]

Jiang, Liang [3,4,5]

Yuan, Huishu [2]

Qian, Yueliang [1]

Affiliations:

[1] Chinese Acad Sci, Beijing Key Lab Mobile Comp & Pervas Device, Inst Comp Technol, Beijing 100190, Peoples R China

[2] Peking Univ, Hosp 3, Dept Radiol, Beijing 100191, Peoples R China

[3] Peking Univ, Hosp 3, Dept Orthopaed, Beijing 100191, Peoples R China

[4] Engn Res Ctr Bone & Joint Precis Med, Beijing 100191, Peoples R China

[5] Beijing Key Lab Spinal Dis Res, Beijing 100191, Peoples R China

[6] Univ Chinese Acad Sci, Beijing 100086, Peoples R China

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III | 2022 / Vol. 13433

Funding:

Beijing Natural Science Foundation;

Keywords:

Multi-modal fusion; Attention learning; Tumor diagnosis; SEGMENTATION;

DOI:

10.1007/978-3-031-16437-8_17

Chinese Library Classification (CLC) number:

R445 [Diagnostic imaging];

Discipline classification code:

100207;

Abstract:

Multi-modal fusion of medical images has been widely used in recent years. Most methods focus on images from a single plane, such as the axial plane with different sequences (T1, T2) or different modalities (CT, MRI), rather than on multiple planes with or without cross modalities. Furthermore, most methods perform segmentation or classification at the image or sequence level rather than the patient level. This paper proposes a general and scalable framework named MAL for classifying benign and malignant tumors at the patient level based on multi-modal attention learning. A bipartite graph models the correlations between different modalities, and modal fusion is then carried out in feature space by attention learning and multi-branch networks. Multi-instance learning is then adopted to obtain patient-level diagnostic results by treating the different modal pairs of a patient's images as bags and the edges in the bipartite graph as instances. Modal and intra-type similarity losses at the patient level are calculated from the feature similarity matrix to encourage the model to extract high-level semantic features with high correlation. Experimental results confirm the effectiveness of MAL on three datasets covering different multi-modal fusion tasks: axial and sagittal MRI, axial CT and sagittal MRI, and T1 and T2 MRI sequences. Applying MAL can also significantly improve the diagnostic accuracy and efficiency of doctors.
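The bag-and-instance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a complete bipartite graph between two image sets of one patient, and it swaps in a generic softmax attention pooling to aggregate edge-level scores into a patient-level score. All names (`bipartite_edges`, `patient_score`) and all numeric values are hypothetical placeholders, and no neural network or feature extraction is shown.

```python
import math
from itertools import product


def bipartite_edges(modal_a, modal_b):
    """Pair every image in modal_a with every image in modal_b.

    Assumption: the graph is complete bipartite, so each (a, b) pair
    is one edge, i.e. one instance in the patient's bag.
    """
    return list(product(modal_a, modal_b))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def patient_score(edge_scores, edge_logits):
    """Attention-weighted average of instance scores for one bag.

    edge_scores: per-edge malignancy scores (placeholders here).
    edge_logits: per-edge attention logits (placeholders here).
    """
    weights = softmax(edge_logits)
    return sum(w * s for w, s in zip(weights, edge_scores))


# Hypothetical image identifiers for one patient.
axial = ["ax_0", "ax_1"]
sagittal = ["sag_0", "sag_1", "sag_2"]

edges = bipartite_edges(axial, sagittal)  # 2 x 3 = 6 instances

# Placeholder values standing in for model outputs.
edge_scores = [0.9, 0.8, 0.2, 0.7, 0.6, 0.1]
edge_logits = [0.0] * len(edges)  # equal logits give uniform attention

score = patient_score(edge_scores, edge_logits)
print(len(edges), round(score, 2))
```

With equal attention logits the pooling reduces to a plain mean of the edge scores. A learned attention module would instead give more weight to the more informative image pairs.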

Citation

Pages: 175-185

Number of pages: 11
