A parallelly contextual convolutional transformer for medical image segmentation

Citations: 7
Authors
Feng, Yuncong [1 ,2 ,3 ]
Su, Jianyu [1 ]
Zheng, Jian [4 ]
Zheng, Yupeng [5 ]
Zhang, Xiaoli [3 ,6 ]
Affiliations
[1] Changchun Univ Technol, Coll Comp Sci & Engn, Changchun 130012, Jilin, Peoples R China
[2] Changchun Univ Technol, Artificial Intelligence Res Inst, Changchun 130012, Jilin, Peoples R China
[3] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Jilin, Peoples R China
[4] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 211189, Jiangsu, Peoples R China
[5] Lihuili Hosp, Ningbo Med Ctr, Dept Colon Anorectal Surg, Ningbo 315100, Zhejiang, Peoples R China
[6] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Jilin, Peoples R China
Keywords
Medical image segmentation; Deep learning; Transformer; Convolutional neural networks; Parallel architecture; U-Net
DOI
10.1016/j.bspc.2024.106674
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
Hybrid architectures based on Convolutional Neural Networks (CNNs) and Transformers have been extensively employed in medical image segmentation. However, previous studies have had difficulty combining global and local features effectively or fully exploiting the rich context, leading to suboptimal segmentation. To address this shortcoming, this paper proposes the Parallel Contextual Convolutional Transformer (PCCTrans), whose encoder-decoder consists of two parallel hybrid modules: Contextual Transformer & Convolution (CoT&Conv) and Fully Convolutional Transformer & Convolution (FCT&Conv). A proposed Multiscale Fusion Output (MSF) module and channel-attention skip connections are used to further improve segmentation performance. Specifically, PCCTrans follows a U-shaped encoder-decoder design. Its shallow CoT block harnesses the contextual information among the input keys to guide the learning of a dynamic attention matrix, thereby enhancing the acquisition of global information. The deeper, improved FCT block captures the fine-grained nature of the segmentation task and the long-term dependencies in the inputs. At the end of the decoder, the MSF module fuses the features learned by the model to enhance segmentation. The experimental results demonstrate that PCCTrans outperforms existing state-of-the-art models on the Synapse Multi-Organ Segmentation and Automated Cardiac Diagnosis Challenge (ACDC) datasets without any pre-training. On the Dice metric, PCCTrans outperforms its direct competitors by 1.37% on the Synapse dataset and 0.66% on the ACDC dataset, with up to threefold fewer parameters. The method also achieved superior evaluation metrics and segmentation results on four other RGB medical image datasets.
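The abstract reports results on the Dice metric. As a reminder of what that number measures, the following is a minimal sketch of the standard per-class Dice coefficient, 2|P ∩ T| / (|P| + |T|), averaged over classes. The function names `dice_score` and `mean_dice`, the flat label-list input, and the convention of returning 1.0 when a class is absent from both masks are assumptions made for illustration; the record does not describe the paper's exact evaluation protocol.

```python
def dice_score(pred, target, label):
    """Dice coefficient for one class over two flat sequences of labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    p = [x == label for x in pred]      # predicted mask for this class
    t = [x == label for x in target]    # ground-truth mask for this class
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    if denom == 0:
        return 1.0  # assumed convention: class absent from both masks
    return 2.0 * intersection / denom


def mean_dice(pred, target, labels):
    """Unweighted mean of per-class Dice over the given foreground labels."""
    return sum(dice_score(pred, target, c) for c in labels) / len(labels)


# Toy example: 0 is background, 1 and 2 are two hypothetical organ labels.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
print(dice_score(pred, target, 2))      # overlap 2, sizes 2 and 3
print(mean_dice(pred, target, [1, 2]))
```

Because Dice is a ratio of overlap to total mask size, it is sensitive to small structures: missing a few pixels of a small organ lowers that class's score far more than the same error on a large one.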
Pages: 12