Automated multi-modal Transformer network (AMTNet) for 3D medical images segmentation

Cited by: 8
Authors
Zheng, Shenhai [1 ,2 ]
Tan, Jiaxin [1 ]
Jiang, Chuangbo [3 ]
Li, Laquan [1 ,3 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Sch Sci, Chongqing 400065, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
medical image segmentation; Transformer; multi-modal; feature fusion; TUMOR SEGMENTATION;
DOI
10.1088/1361-6560/aca74c
CLC Number
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Objective. In recent years, convolutional neural network (CNN)-based methods have dominated the field of medical image segmentation. Their main drawback, however, is difficulty in representing long-range dependencies. Recently, the Transformer has demonstrated superior performance in computer vision and has also been successfully applied to medical image segmentation, owing to its self-attention mechanism and its ability to encode long-range dependencies in images. To the best of our knowledge, only a few works have focused on cross-modality image segmentation using the Transformer. Hence, the main objective of this study was to design, propose, and validate a deep learning method that extends the Transformer to multi-modal medical image segmentation. Approach. This paper proposes a novel automated multi-modal Transformer network, termed AMTNet, for 3D medical image segmentation. Specifically, the network is a carefully designed U-shaped architecture in which substantial changes have been made to the feature encoding, fusion, and decoding parts. The encoding part comprises 3D embedding, 3D multi-modal Transformer, and 3D co-learn down-sampling blocks. Symmetrically, the decoding part includes 3D Transformer, up-sampling, and 3D expanding blocks. In addition, a Transformer-based adaptive channel-interleaved feature fusion module is designed to fully fuse features from different modalities. Main results. We provide a comprehensive experimental analysis on the Prostate and BraTS2021 datasets. The results show that our method achieves average DSCs of 0.907 and 0.851 (0.734 for ET, 0.895 for TC, and 0.924 for WT) on these two datasets, respectively. These values show that AMTNet yields significant improvements over state-of-the-art segmentation networks. Significance. The proposed 3D segmentation network exploits complementary features of different modalities at multiple scales during feature extraction to enrich 3D feature representations and improve segmentation efficiency. This powerful network broadens research on the Transformer for multi-modal medical image segmentation.
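As a rough illustration of the channel-interleaving idea behind the fusion module described above, the sketch below interleaves the channel dimensions of two modality feature maps (e.g. two MRI sequences). The function name and the exact alternating scheme are assumptions for illustration only, not the authors' implementation, which additionally applies Transformer-based adaptive reweighting of the fused channels.

```python
import numpy as np


def interleave_channels(feat_a, feat_b):
    """Interleave channels of two 3D modality feature maps.

    feat_a, feat_b: arrays of shape (C, D, H, W), one per modality.
    Returns an array of shape (2C, D, H, W) whose channels alternate
    a0, b0, a1, b1, ... so that corresponding modality features sit
    next to each other before any subsequent attention/reweighting.
    """
    assert feat_a.shape == feat_b.shape, "modality features must match"
    c, d, h, w = feat_a.shape
    fused = np.empty((2 * c, d, h, w), dtype=feat_a.dtype)
    fused[0::2] = feat_a  # even channels come from modality A
    fused[1::2] = feat_b  # odd channels come from modality B
    return fused


# Toy example: two 4-channel feature maps over a 2x2x2 volume
a = np.zeros((4, 2, 2, 2))
b = np.ones((4, 2, 2, 2))
fused = interleave_channels(a, b)
print(fused.shape)  # (8, 2, 2, 2)
```

In a full network this interleaved tensor would then be fed to a Transformer block so that self-attention can adaptively weight the contribution of each modality's channels.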
Pages: 18
Related Papers
50 total
  • [1] OctopusNet: A Deep Learning Segmentation Network for Multi-modal Medical Images
    Chen, Yu
    Chen, Jiawei
    Wei, Dong
    Li, Yuexiang
    Zheng, Yefeng
    MULTISCALE MULTIMODAL MEDICAL IMAGING, MMMI 2019, 2020, 11977 : 17 - 25
  • [2] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
    Zhang, Menghui
    Zhang, Yuchen
    Liu, Shuaibing
    Han, Yahui
    Cao, Honggang
    Qiao, Bingbing
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [3] 3D deeply supervised network for automated segmentation of volumetric medical images
    Dou, Qi
    Yu, Lequan
    Chen, Hao
    Jin, Yueming
    Yang, Xin
    Qin, Jing
    Heng, Pheng-Ann
    MEDICAL IMAGE ANALYSIS, 2017, 41 : 40 - 54
  • [4] MFTransNet: A Multi-Modal Fusion with CNN-Transformer Network for Semantic Segmentation of HSR Remote Sensing Images
    He, Shumeng
    Yang, Houqun
    Zhang, Xiaoying
    Li, Xuanyu
    MATHEMATICS, 2023, 11 (03)
  • [5] Multi-modal Transformer for Brain Tumor Segmentation
    Cho, Jihoon
    Park, Jinah
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, BRAINLES 2022, 2023, 13769 : 138 - 148
  • [6] Multi-Modal Segmentation of 3D Brain Scans Using Neural Networks
    Zopes, Jonathan
    Platscher, Moritz
    Paganucci, Silvio
    Federau, Christian
    FRONTIERS IN NEUROLOGY, 2021, 12
  • [7] Dual-Attention Deep Fusion Network for Multi-modal Medical Image Segmentation
    Zheng, Shenhai
    Ye, Xin
    Tan, Jiaxin
    Yang, Yifei
    Li, Laquan
    FOURTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING, ICGIP 2022, 2022, 12705
  • [8] A multi-modal and multi-stage fusion enhancement network for segmentation based on OCT and OCTA images
    Quan, Xiongwen
    Hou, Guangyao
    Yin, Wenya
    Zhang, Han
    INFORMATION FUSION, 2025, 113
  • [9] GATR: Transformer Based on Guided Aggregation Decoder for 3D Multi-Modal Detection
    Luo, Yikai
    He, Linyuan
    Ma, Shiping
    Qi, Zisen
    Fan, Zunlin
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11): 9725 - 9732
  • [10] IMIIN: An inter-modality information interaction network for 3D multi-modal breast tumor segmentation
    Peng, Chengtao
    Zhang, Yue
    Zheng, Jian
    Li, Bin
    Shen, Jun
    Li, Ming
    Liu, Lei
    Qiu, Bensheng
    Chen, Danny Z.
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2022, 95