Hierarchical Decoder with Parallel Transformer and CNN for Medical Image Segmentation

Cited by: 0
Authors
Li, Shijie [1]
Gong, Yu [1]
Xiang, Qingyuan [1]
Li, Zheng [1, 2]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Tianfu Engn Oriented Numerical Simulat & Software, Chengdu 610207, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XIV | 2025 / Vol. 15044
Funding
National Natural Science Foundation of China;
Keywords
Medical image segmentation; Hierarchical decoder; Attention mechanism; PLUS PLUS;
DOI
10.1007/978-981-97-8496-7_10
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the success of Transformers, hybrid Transformer and CNN methods have gained considerable popularity in medical image segmentation. These methods use a hybrid architecture that combines Transformers and CNNs to fuse global and local information, supplemented by a pyramid structure to facilitate multi-scale interaction. However, they encounter two primary limitations: (i) Transformers struggle to capture complete global information due to the sliding-window nature of the convolutional operator, and (ii) the pyramid structure within a single decoder fails to provide sufficient multi-scale interaction to restore detailed features at higher levels. In this paper, we introduce the Hierarchical Decoder with Parallel Transformer and CNN (HiPar), a novel architecture designed to address these limitations. First, we present a parallel structure of Transformer and CNN to maximize the capture of both global and local features. Second, we propose a hierarchical decoder to model multi-scale information and progressively restore spatial details. Additionally, we incorporate lightweight components to enhance the efficiency of feature representation. Extensive experiments demonstrate that HiPar achieves state-of-the-art results on three popular medical image segmentation benchmarks: Synapse, ACDC, and GlaS.
Pages: 133-147
Number of pages: 15