A parallelly contextual convolutional transformer for medical image segmentation

Cited: 7
Authors
Feng, Yuncong [1 ,2 ,3 ]
Su, Jianyu [1 ]
Zheng, Jian [4 ]
Zheng, Yupeng [5 ]
Zhang, Xiaoli [3 ,6 ]
Affiliations
[1] Changchun Univ Technol, Coll Comp Sci & Engn, Changchun 130012, Jilin, Peoples R China
[2] Changchun Univ Technol, Artificial Intelligence Res Inst, Changchun 130012, Jilin, Peoples R China
[3] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Jilin, Peoples R China
[4] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 211189, Jiangsu, Peoples R China
[5] Lihuili Hosp, Ningbo Med Ctr, Dept Colon Anorectal Surg, Ningbo 315100, Zhejiang, Peoples R China
[6] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Jilin, Peoples R China
Keywords
Medical image segmentation; Deep learning; Transformer; Convolutional neural networks; Parallel architecture; U-NET;
DOI
10.1016/j.bspc.2024.106674
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Code
0831 ;
Abstract
Hybrid architectures based on Convolutional Neural Networks (CNNs) and Transformers have been extensively employed in medical image segmentation. However, previous studies have encountered difficulties in effectively combining global and local features or fully exploiting the rich context, leading to suboptimal segmentation. To address these shortcomings, this paper proposes the Parallel Contextual Convolutional Transformer (PCCTrans), whose encoder-decoder consists of Contextual Transformer & Convolution (CoT&Conv) and Fully Convolutional Transformer & Convolution (FCT&Conv) parallel hybrid modules. The proposed Multiscale Fusion Output (MSF) module and channel-attention skip connections are utilized to obtain better segmentation performance. Specifically, PCCTrans follows a U-shaped encoder-decoder design with a shallow CoT block that harnesses the contextual information among the input keys to guide the learning of the dynamic attention matrix, thereby enhancing the acquisition of global information. The deeper, improved FCT block effectively captures the fine-grained nature of the segmentation task as well as long-term dependencies in the inputs. At the end of the decoder, the proposed MSF module fuses the features learned by the model to enhance segmentation. The experimental results demonstrate that PCCTrans outperforms existing state-of-the-art models on the Synapse Multi-Organ Segmentation and Automated Cardiac Diagnosis Challenge (ACDC) datasets without any pre-training. On the Dice metric, PCCTrans outperforms its direct competitors by 1.37% on the Synapse dataset and 0.66% on the ACDC dataset, with up to threefold fewer parameters. Notably, the proposed method also achieves superior evaluation metrics and segmentation results on four additional colour (RGB) medical image datasets.
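The abstract's CoT block derives its attention matrix from the contextual (neighbourhood-encoded) keys rather than from isolated query-key pairs. The paper itself provides no code; the sketch below is a heavily simplified illustration of that contextual-attention idea for a 1-D sequence of feature vectors, not the authors' implementation. The neighbourhood averaging (standing in for the key-encoding convolution) and the additive query-context combination are assumptions made for brevity.

```python
import numpy as np

def cot_attention(x, k=3):
    """Illustrative sketch of Contextual-Transformer-style attention.

    x : array of shape (length, channels) -- a 1-D token sequence.
    k : neighbourhood size used to contextualize the keys
        (a stand-in for the k x k convolution in the real CoT block).
    """
    n, c = x.shape
    pad = k // 2
    # Static context: each key is replaced by the average of its
    # k-neighbourhood (edge-padded), so keys carry local context.
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    static_ctx = np.stack([xp[i:i + k].mean(axis=0) for i in range(n)])
    # Dynamic attention matrix guided by query + contextual key
    # (the real block concatenates them and applies 1x1 convolutions).
    q = x + static_ctx
    scores = q @ q.T / np.sqrt(c)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    dynamic_ctx = attn @ x  # values are the raw inputs here
    # Fuse static and dynamic contexts into the block output.
    return static_ctx + dynamic_ctx
```

The key design point mirrored here is that the attention weights are computed from context-enriched keys, so neighbouring tokens influence how each pair of positions attends to one another, rather than attention depending only on the two isolated tokens.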
Pages: 12