Automated multi-modal Transformer network (AMTNet) for 3D medical images segmentation

Cited by: 8
Authors
Zheng, Shenhai [1 ,2 ]
Tan, Jiaxin [1 ]
Jiang, Chuangbo [3 ]
Li, Laquan [1 ,3 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Sch Sci, Chongqing 400065, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
medical image segmentation; Transformer; multi-modal; feature fusion; TUMOR SEGMENTATION;
DOI
10.1088/1361-6560/aca74c
CLC Number
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Objective. In recent years, convolutional neural network (CNN)-based methods have dominated the field of medical image segmentation. Their main drawback, however, is difficulty in representing long-range dependencies. Recently, the Transformer has demonstrated superior performance in computer vision and has also been successfully applied to medical image segmentation, owing to its self-attention mechanism and its ability to encode long-range dependencies in images. To the best of our knowledge, only a few works have focused on cross-modality image segmentation using the Transformer. Hence, the main objective of this study was to design, propose, and validate a deep learning method that extends the Transformer to multi-modal medical image segmentation. Approach. This paper proposes a novel automated multi-modal Transformer network, termed AMTNet, for 3D medical image segmentation. Specifically, the network is a carefully designed U-shaped architecture in which substantial changes have been made to the feature encoding, fusion, and decoding parts. The encoding part comprises 3D embedding, 3D multi-modal Transformer, and 3D co-learn down-sampling blocks. Symmetrically, the decoding part includes 3D Transformer, up-sampling, and 3D expanding blocks. In addition, a Transformer-based adaptive channel-interleaved feature fusion module is designed to fully fuse features from different modalities. Main results. We provide a comprehensive experimental analysis on the Prostate and BraTS2021 datasets. The results show that our method achieves average DSCs of 0.907 and 0.851 (0.734 for ET, 0.895 for TC, and 0.924 for WT) on these two datasets, respectively. These values show that AMTNet yields significant improvements over state-of-the-art segmentation networks. Significance. The proposed 3D segmentation network exploits complementary features of different modalities at multiple scales during feature extraction to enrich 3D feature representations and improve segmentation efficiency. This powerful network broadens research on the Transformer for multi-modal medical image segmentation.
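As a rough illustration of the channel-interleaving idea behind the fusion module described above, the sketch below interleaves the channel dimensions of two modality feature maps (e.g. two MRI sequences). The function name and the exact alternating scheme are assumptions for illustration only, not the authors' implementation, which additionally applies Transformer-based adaptive reweighting of the fused channels.

```python
import numpy as np


def interleave_channels(feat_a, feat_b):
    """Interleave channels of two 3D modality feature maps.

    feat_a, feat_b: arrays of shape (C, D, H, W), one per modality.
    Returns an array of shape (2C, D, H, W) whose channels alternate
    a0, b0, a1, b1, ... so that corresponding modality features sit
    next to each other before any subsequent attention/reweighting.
    """
    assert feat_a.shape == feat_b.shape, "modality features must match"
    c, d, h, w = feat_a.shape
    fused = np.empty((2 * c, d, h, w), dtype=feat_a.dtype)
    fused[0::2] = feat_a  # even channels come from modality A
    fused[1::2] = feat_b  # odd channels come from modality B
    return fused


# Toy example: two 4-channel feature maps over a 2x2x2 volume
a = np.zeros((4, 2, 2, 2))
b = np.ones((4, 2, 2, 2))
fused = interleave_channels(a, b)
print(fused.shape)  # (8, 2, 2, 2)
```

In a full network this interleaved tensor would then be fed to a Transformer block so that self-attention can adaptively weight the contribution of each modality's channels.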
Pages: 18
Related Papers
50 total
  • [1] OctopusNet: A Deep Learning Segmentation Network for Multi-modal Medical Images
    Chen, Yu
    Chen, Jiawei
    Wei, Dong
    Li, Yuexiang
    Zheng, Yefeng
    MULTISCALE MULTIMODAL MEDICAL IMAGING, MMMI 2019, 2020, 11977 : 17 - 25
  • [2] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
    Zhang, Menghui
    Zhang, Yuchen
    Liu, Shuaibing
    Han, Yahui
    Cao, Honggang
    Qiao, Bingbing
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [3] 3D deeply supervised network for automated segmentation of volumetric medical images
    Dou, Qi
    Yu, Lequan
    Chen, Hao
    Jin, Yueming
    Yang, Xin
    Qin, Jing
    Heng, Pheng-Ann
    MEDICAL IMAGE ANALYSIS, 2017, 41 : 40 - 54
  • [4] MFTransNet: A Multi-Modal Fusion with CNN-Transformer Network for Semantic Segmentation of HSR Remote Sensing Images
    He, Shumeng
    Yang, Houqun
    Zhang, Xiaoying
    Li, Xuanyu
    MATHEMATICS, 2023, 11 (03)
  • [5] Multi-modal Transformer for Brain Tumor Segmentation
    Cho, Jihoon
    Park, Jinah
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, BRAINLES 2022, 2023, 13769 : 138 - 148
  • [6] Multi-Modal Segmentation of 3D Brain Scans Using Neural Networks
    Zopes, Jonathan
    Platscher, Moritz
    Paganucci, Silvio
    Federau, Christian
    FRONTIERS IN NEUROLOGY, 2021, 12
  • [7] Dual-Attention Deep Fusion Network for Multi-modal Medical Image Segmentation
    Zheng, Shenhai
    Ye, Xin
    Tan, Jiaxin
    Yang, Yifei
    Li, Laquan
    FOURTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING, ICGIP 2022, 2022, 12705
  • [8] A multi-modal and multi-stage fusion enhancement network for segmentation based on OCT and OCTA images
    Quan, Xiongwen
    Hou, Guangyao
    Yin, Wenya
    Zhang, Han
    INFORMATION FUSION, 2025, 113
  • [9] GATR: Transformer Based on Guided Aggregation Decoder for 3D Multi-Modal Detection
    Luo, Yikai
    He, Linyuan
    Ma, Shiping
    Qi, Zisen
    Fan, Zunlin
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11): 9725 - 9732
  • [10] IMIIN: An inter-modality information interaction network for 3D multi-modal breast tumor segmentation
    Peng, Chengtao
    Zhang, Yue
    Zheng, Jian
    Li, Bin
    Shen, Jun
    Li, Ming
    Liu, Lei
    Qiu, Bensheng
    Chen, Danny Z.
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2022, 95