MAL: Multi-modal Attention Learning for Tumor Diagnosis Based on Bipartite Graph and Multiple Branches

Cited by: 3
Authors
Jiao, Menglei [1, 6]
Liu, Hong [1]
Liu, Jianfang [2]
Ouyang, Hanqiang [3, 4, 5]
Wang, Xiangdong [1]
Jiang, Liang [3, 4, 5]
Yuan, Huishu [2]
Qian, Yueliang [1]
Affiliations
[1] Chinese Acad Sci, Beijing Key Lab Mobile Comp & Pervas Device, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Peking Univ, Hosp 3, Dept Radiol, Beijing 100191, Peoples R China
[3] Peking Univ, Hosp 3, Dept Orthopaed, Beijing 100191, Peoples R China
[4] Engn Res Ctr Bone & Joint Precis Med, Beijing 100191, Peoples R China
[5] Beijing Key Lab Spinal Dis Res, Beijing 100191, Peoples R China
[6] Univ Chinese Acad Sci, Beijing 100086, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III | 2022 / Vol. 13433
Funding
Beijing Natural Science Foundation
Keywords
Multi-modal fusion; Attention learning; Tumor diagnosis; Segmentation
DOI
10.1007/978-3-031-16437-8_17
Chinese Library Classification number
R445 [Diagnostic imaging]
Discipline classification code
100207
Abstract
The multi-modal fusion of medical images has been widely used in recent years. Most methods focus on images from a single plane, such as the axial plane with different sequences (T1, T2) or different modalities (CT, MRI), rather than on multiple planes with or without cross modalities. Further, most methods perform segmentation or classification at the image or sequence level rather than the patient level. This paper proposes a general and scalable framework named MAL for classifying benign and malignant tumors at the patient level based on multi-modal attention learning. A bipartite graph is used to model the correlations between different modalities, and modal fusion is then carried out in feature space by attention learning and multi-branch networks. Thereafter, multi-instance learning is adopted to obtain patient-level diagnostic results by treating the different modal pairs of patient images as bags and the edges in the bipartite graph as instances. The modal and intra-type similarity losses at the patient level are calculated from the feature similarity matrix to encourage the model to extract high-level semantic features with high correlation. The experimental results confirm the effectiveness of MAL on three datasets covering different multi-modal fusion tasks: axial and sagittal MRI, axial CT and sagittal MRI, and T1 and T2 MRI sequences. The application of MAL can also significantly improve the diagnostic accuracy and efficiency of doctors.
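The bag-and-instance construction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names (bipartite_edges, attention_pool), the concatenation of node features into an edge feature, the linear scoring vector w, and the toy feature values are all assumptions made for illustration. It shows only the general idea of treating each edge of a bipartite graph between two modalities as an instance and pooling the instances of one bag with attention weights.

```python
import math
from itertools import product


def bipartite_edges(left, right):
    """Pair every left-modality feature vector with every right-modality one.

    Each edge (instance) is represented here by concatenating the two node
    feature vectors; this representation is an assumption of the sketch.
    """
    return [l + r for l, r in product(left, right)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(instances, w):
    """Attention-weighted sum of instance features for one bag.

    w is a hypothetical scoring vector standing in for a learned attention
    module; a real model would learn it (or a small network) during training.
    """
    scores = [sum(x * wi for x, wi in zip(inst, w)) for inst in instances]
    alphas = softmax(scores)
    dim = len(instances[0])
    pooled = [
        sum(a * inst[d] for a, inst in zip(alphas, instances))
        for d in range(dim)
    ]
    return pooled, alphas


# Toy per-image features for one patient (made-up values, 2 dimensions each).
axial = [[1.0, 0.0], [0.5, 0.5]]
sagittal = [[0.0, 1.0], [0.2, 0.8], [0.9, 0.1]]

edges = bipartite_edges(axial, sagittal)  # 2 x 3 = 6 instances in the bag
pooled, alphas = attention_pool(edges, w=[0.1, 0.2, 0.3, 0.4])
```

In this sketch the pooled vector would feed a patient-level classifier; the similarity losses mentioned in the abstract are not modeled here.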
Pages: 175-185
Page count: 11