MMMViT: Multiscale multimodal vision transformer for brain tumor segmentation with missing modalities

Times Cited: 2
Authors
Qiu, Chengjian [1 ]
Song, Yuqing [1 ]
Liu, Yi [1 ]
Zhu, Yan [2 ]
Han, Kai [1 ]
Sheng, Victor S. [3 ]
Liu, Zhe [1 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Peoples R China
[2] Jiangsu Univ, Affiliated Hosp, Dept Radiol, Zhenjiang 212001, Peoples R China
[3] Texas Tech Univ, Dept Comp Sci, Lubbock, TX USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Brain tumor segmentation; Missing modalities; Global multiscale features; Correlation across modalities;
DOI
10.1016/j.bspc.2023.105827
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Accurate segmentation of brain tumors from multimodal MRI sequences is a critical prerequisite for brain tumor diagnosis, prognosis, and surgical treatment. In clinical practice, however, one or more modalities are often missing, which causes most previous methods that rely on complete modality data to fail. To address this problem, the current state-of-the-art Transformer-based approach directly fuses the available modality-specific features into a shared latent representation, aiming to extract common features that are robust to any combinatorial subset of the modalities. However, directly learning such a shared latent representation is not trivial because of the diversity of these combinatorial subsets. Furthermore, this approach exploits neither correlations across modalities nor global multiscale features. In this work, we propose a Multiscale Multimodal Vision Transformer (MMMViT), which not only leverages correlations across modalities to decouple the direct fusion procedure into two simple steps, but also fuses local multiscale features as the input of the intra-modal Transformer block to implicitly obtain global multiscale features and thereby adapt to brain tumors of various sizes. We evaluate the method on the BraTS 2018 dataset with both full-modality and various missing-modality inputs, and the results demonstrate that it achieves state-of-the-art performance. Code is available at: https://github.com/qiuchengjian/MMMViT.
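
The abstract points to two ideas: tokens from several local scales are fused before an intra-modal Transformer block (so attention spans all scales), and inter-modal fusion is decoupled into two steps. The minimal PyTorch sketch below only illustrates that general pattern; the module names, channel sizes, two-scale tokenizer, shared weights across modalities, and the simple averaging standing in for the second fusion step are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

# Hedged sketch of multiscale tokenization followed by intra-modal attention.
# All design choices here are assumptions made for illustration only.
import torch
import torch.nn as nn


class MultiscaleTokenizer(nn.Module):
    """Extract local features at two scales and flatten them into one token sequence."""

    def __init__(self, in_ch: int = 1, dim: int = 64):
        super().__init__()
        # Two convolutional branches with different strides -> two spatial scales.
        self.coarse = nn.Conv3d(in_ch, dim, kernel_size=4, stride=4)
        self.fine = nn.Conv3d(in_ch, dim, kernel_size=2, stride=2)
        self.pool = nn.AdaptiveAvgPool3d(4)  # bring the fine branch to a small grid

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        coarse = self.coarse(x)                          # (B, dim, D/4, H/4, W/4)
        fine = self.pool(self.fine(x))                   # (B, dim, 4, 4, 4)
        tokens = torch.cat([coarse.flatten(2), fine.flatten(2)], dim=2)
        return tokens.transpose(1, 2)                    # (B, N_tokens, dim)


class IntraModalBlock(nn.Module):
    """Self-attention over the fused multiscale tokens of one modality."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.block(tokens)


if __name__ == "__main__":
    # Toy example: only two MRI modalities are available (the others are missing).
    modalities = [torch.randn(1, 1, 32, 32, 32) for _ in range(2)]
    tokenizer, intra = MultiscaleTokenizer(), IntraModalBlock()

    # Step 1 (assumed): per-modality multiscale tokenization + intra-modal attention.
    # Sharing one tokenizer across modalities is a simplification of this sketch.
    per_modality = [intra(tokenizer(m)) for m in modalities]

    # Step 2 (assumed): a plain average over available modalities stands in for the
    # correlation-guided inter-modal fusion described in the abstract.
    shared = torch.stack(per_modality, dim=0).mean(dim=0)
    print(shared.shape)  # torch.Size([1, 576, 64])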
Pages: 10