An effective CNN and Transformer complementary network for medical image segmentation

Cited by: 240
Authors
Yuan, Feiniu [1, 3, 4]
Zhang, Zhengxiao [1, 3, 4]
Fang, Zhijun [2]
Affiliations
[1] Shanghai Normal Univ SHNU, Coll Informat Mech & Elect Engn, Shanghai 201418, Peoples R China
[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 201620, Peoples R China
[3] Shanghai Normal Univ, Res Base Online Educ Shanghai Middle & Primary Sch, Shanghai 201418, Peoples R China
[4] Shanghai Normal Univ, Shanghai Engn Res Ctr Intelligent Educ & Bigdata, Shanghai 200234, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transformer; Medical image segmentation; Feature complementary module; Cross-domain fusion; Convolutional Neural Network; Attention
DOI
10.1016/j.patcog.2022.109228
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer network was originally proposed for natural language processing. Owing to its powerful ability to represent long-range dependencies, it has been extended to vision tasks in recent years. To fully exploit the advantages of Transformers and Convolutional Neural Networks (CNNs), we propose a CNN and Transformer Complementary Network (CTC-Net) for medical image segmentation. We first design two encoders, based on Swin Transformers and Residual CNNs, to produce complementary features in the Transformer and CNN domains, respectively. We then cross-wisely concatenate these complementary features to form a Cross-domain Fusion Block (CFB) that blends them effectively. In addition, we compute the correlation between features from the CNN and Transformer domains, and apply channel attention to the Transformer self-attention features to capture dual attention information. We combine cross-domain fusion, feature correlation and dual attention into a Feature Complementary Module (FCM) that improves the representation ability of features. Finally, we design a Swin Transformer decoder to further improve the representation of long-range dependencies, and use skip connections between the Transformer decoded features and the complementary features to extract spatial details, contextual semantics and long-range information. Skip connections are applied at different levels to enhance multi-scale invariance. Experimental results show that CTC-Net significantly surpasses state-of-the-art image segmentation models based on CNNs, Transformers, and even combined Transformer and CNN models designed for medical image segmentation. It achieves superior performance on different medical applications, including multi-organ segmentation and cardiac segmentation. (c) 2022 Elsevier Ltd. All rights reserved.
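The abstract names three ingredients of the Feature Complementary Module: cross-domain fusion, feature correlation, and channel attention. The sketch below is only a toy illustration of those ideas in plain Python, not the authors' implementation. The function names, the use of cosine similarity as the "correlation", the weight-free sigmoid gate, and the `toy_fuse` combination rule are all assumptions made for illustration; the paper's actual blocks use learned layers that the abstract does not specify.

```python
import math

# A feature map is modelled as a list of channels; each channel is a flat
# list of floats (spatial positions flattened). This layout is an assumption.

def channel_attention(feat):
    """Scale each channel by a sigmoid of its global average.

    Hypothetical, weight-free stand-in for a learned channel-attention layer.
    """
    out = []
    for ch in feat:
        mean = sum(ch) / len(ch)
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([gate * v for v in ch])
    return out

def channel_correlation(cnn_feat, trans_feat):
    """Cosine similarity between matching channels of the two domains."""
    sims = []
    for a, b in zip(cnn_feat, trans_feat):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        sims.append(dot / (na * nb) if na and nb else 0.0)
    return sims

def toy_fuse(cnn_feat, trans_feat):
    """Toy fusion (assumed rule): weight CNN channels by their correlation
    with the Transformer channels, gate the Transformer channels, then
    concatenate both along the channel axis."""
    corr = channel_correlation(cnn_feat, trans_feat)
    weighted_cnn = [[c * v for v in ch] for c, ch in zip(corr, cnn_feat)]
    return weighted_cnn + channel_attention(trans_feat)

if __name__ == "__main__":
    cnn = [[1.0, 0.0], [0.0, 2.0]]
    trans = [[2.0, 0.0], [1.0, -1.0]]
    print(channel_correlation(cnn, trans))
    print(len(toy_fuse(cnn, trans)))  # channel count doubles after concatenation
```

Concatenating along the channel axis keeps both domains' features available to later layers, which is the general motivation for fusion blocks of this kind; how CTC-Net orders and mixes the concatenated features is not recoverable from the abstract.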
Pages: 12