An effective CNN and Transformer complementary network for medical image segmentation

Times cited: 240

Authors:

Yuan, Feiniu [1, 3, 4]

Zhang, Zhengxiao [1, 3, 4]

Fang, Zhijun [2]

Affiliations:

[1] Shanghai Normal Univ SHNU, Coll Informat Mech & Elect Engn, Shanghai 201418, Peoples R China

[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 201620, Peoples R China

[3] Shanghai Normal Univ, Res Base Online Educ Shanghai Middle & Primary Sch, Shanghai 201418, Peoples R China

[4] Shanghai Normal Univ, Shanghai Engn Res Ctr Intelligent Educ & Bigdata, Shanghai 200234, Peoples R China

Source:

PATTERN RECOGNITION | 2023 / Volume 136

Funding:

National Natural Science Foundation of China;

Keywords:

Transformer; Medical image segmentation; Feature complementary module; Cross-domain fusion; Convolutional Neural Network; ATTENTION;

DOI:

10.1016/j.patcog.2022.109228

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Transformer network was originally proposed for natural language processing. Due to its powerful representation ability for long-range dependency, it has been extended for vision tasks in recent years. To fully utilize the advantages of Transformers and Convolutional Neural Networks (CNNs), we propose a CNN and Transformer Complementary Network (CTC-Net) for medical image segmentation. We first design two encoders by Swin Transformers and Residual CNNs to produce complementary features in Transformer and CNN domains, respectively. Then we cross-wisely concatenate these complementary features to propose a Cross-domain Fusion Block (CFB) for effectively blending them. In addition, we compute the correlation between features from the CNN and Transformer domains, and apply channel attention to the self-attention features by Transformers for capturing dual attention information. We incorporate cross-domain fusion, feature correlation and dual attention together to propose a Feature Complementary Module (FCM) for improving the representation ability of features. Finally, we design a Swin Transformer decoder to further improve the representation ability of long-range dependencies, and propose to use skip connections between the Transformer decoded features and the complementary features for extracting spatial details, contextual semantics and long-range information. Skip connections are performed in different levels for enhancing multi-scale invariance. Experimental results show that our CTC-Net significantly surpasses the state-of-the-art image segmentation models based on CNNs, Transformers, and even Transformer and CNN combined models designed for medical image segmentation. It achieves superior performance on different medical applications, including multi-organ segmentation and cardiac segmentation. (c) 2022 Elsevier Ltd. All rights reserved.
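
The abstract names three ingredients: feature correlation between the CNN and Transformer domains, channel attention on the Transformer features, and fusion by concatenation. The sketch below is only a toy illustration of those ideas in plain Python, not the CTC-Net implementation. Everything in it is an assumption made for illustration: feature maps are lists of flattened channels, the correlation is taken to be per-channel cosine similarity, the channel attention is a parameter-free sigmoid gate on the channel mean, and the function names (`channel_attention`, `channel_correlation`, `fuse`) are hypothetical.

```python
import math

def channel_attention(feat):
    """Scale each channel by a sigmoid gate of its global average.

    Assumption: no learned weights, unlike a real attention layer.
    """
    out = []
    for channel in feat:
        mean = sum(channel) / len(channel)        # global average pooling
        gate = 1.0 / (1.0 + math.exp(-mean))      # sigmoid gate in (0, 1)
        out.append([gate * v for v in channel])
    return out

def channel_correlation(feat_a, feat_b):
    """Cosine similarity between corresponding channels of two feature maps."""
    sims = []
    for a, b in zip(feat_a, feat_b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        sims.append(dot / norm if norm else 0.0)
    return sims

def fuse(cnn_feat, trans_feat):
    """Concatenate along the channel axis after gating the Transformer branch."""
    return cnn_feat + channel_attention(trans_feat)

# Two channels of four values each, standing in for flattened feature maps.
cnn_feat = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
trans_feat = [[2.0, 4.0, 6.0, 8.0], [1.0, 0.0, 1.0, 0.0]]

fused = fuse(cnn_feat, trans_feat)                # 2 + 2 = 4 channels
corr = channel_correlation(cnn_feat, trans_feat)  # one score per channel pair
```

In this toy input the first channel pair is perfectly aligned and the second is orthogonal, so the correlation scores separate the two cases. A real model would use learned projections and 2-D spatial feature maps.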

Pages: 12

50 items in total

[31] UCTNet: Uncertainty-guided CNN-Transformer hybrid networks for medical image segmentation
Guo, Xiayu
Lin, Xian
Yang, Xin
Yu, Li
Cheng, Kwang-Ting
Yan, Zengqiang
PATTERN RECOGNITION, 2024, 152
[32] Hybrid transformer-CNN with boundary-awareness network for 3D medical image segmentation
He, Jianfei
Xu, Canhui
APPLIED INTELLIGENCE, 2023, 53 (23) : 28542 - 28554
[33] LW-CTrans: A lightweight hybrid network of CNN and Transformer for 3D medical image segmentation
Kuang, Hulin
Wang, Yahui
Tana, Xianzhen
Yang, Jialin
Sun, Jiarui
Liu, Jin
Qiu, Wu
Zhang, Jingyang
Zhang, Jiulou
Yang, Chunfeng
Wang, Jianxin
Chen, Yang
MEDICAL IMAGE ANALYSIS, 2025, 102
[34] A dual-branch and dual attention transformer and CNN hybrid network for ultrasound image segmentation
Zhang, Chong
Wang, Lingtong
Wei, Guohui
Kong, Zhiyong
Qiu, Min
FRONTIERS IN PHYSIOLOGY, 2024, 15
[35] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
Liu, Hongjia
Xiao, Yubin
Wu, Xuan
Li, Yuanshu
Zhao, Peng
Liang, Yanchun
Wang, Liupu
Zhou, You
COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (02) : 2851 - 2868
[36] TransGraphNet: A novel network for medical image segmentation based on transformer and graph convolution
Zhang, Ju
Ye, Zhiyi
Chen, Mingyang
Yu, Jiahao
Cheng, Yun
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 104
[37] LET-Net: locally enhanced transformer network for medical image segmentation
Ta, Na
Chen, Haipeng
Liu, Xianzhu
Jin, Nuo
MULTIMEDIA SYSTEMS, 2023, 29 (06) : 3847 - 3861
[38] Hybrid transformer-CNN with boundary-awareness network for 3D medical image segmentation
Jianfei He
Canhui Xu
Applied Intelligence, 2023, 53 : 28542 - 28554
[39] LET-Net: locally enhanced transformer network for medical image segmentation
Na Ta
Haipeng Chen
Xianzhu Liu
Nuo Jin
Multimedia Systems, 2023, 29 (6) : 3847 - 3861
[40] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
Hongjia Liu
Yubin Xiao
Xuan Wu
Yuanshu Li
Peng Zhao
Yanchun Liang
Liupu Wang
You Zhou
Complex & Intelligent Systems, 2024, 10 : 2851 - 2868