Dual Channel-Spatial Self-Attention Transformer and CNN synergy network for 3D medical image segmentation

Authors
Yang, Fan [1]
Wang, Bo [1]
Affiliations
[1] Ningxia Univ, Sch Elect & Elect Engn, Yinchuan 750021, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural layers; Attention collapse; Self-attention mechanism; Transformers; 3D medical image segmentation;
DOI
10.1016/j.asoc.2024.112255
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Even though the Vision Transformer leverages the self-attention mechanism to capture long-range dependencies, showing significant potential in medical image segmentation, the limited annotations in the image dataset make it difficult for the Transformer model to extract different global features, resulting in attention collapse and generating similar or identical attention maps. Previous studies have attempted to solve the problem by integrating convolutional neural layers into Transformer-based architectures. However, improper integration may lead to the inability of the model to effectively capture local and global information in both spatial and channel dimensions. To address the above issue, we propose a hybrid architecture using the Dual Channel-Spatial Self-Attention Transformer and CNN Synergy Network (DTC-SUNETR) for medical image segmentation. Specifically, we redesigned the self-attention mechanism. A novel Channel-Spatial Self-Attention (CSSA) block is introduced to integrate the enhanced channel and spatial self-attention mechanism to capture the global relationship and local structure among image features. This helps the model to more comprehensively understand the interdependencies between different channels and capture the relationships between different pixels, thus enhancing the feature representation of the corresponding dimensions. Simultaneously, it also improves the overall computational efficiency of the network. Extensive experiments on four different medical image segmentation datasets, including Synapse, ACDC, Brain Tumor, and Lung Tumor, demonstrate the superiority of the proposed DTC-SUNETR over state-of-the-art methods.
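
The distinction the abstract draws between spatial and channel self-attention can be illustrated with a minimal NumPy sketch. This is not the authors' CSSA block, whose details are not given in this record: the function names, the absence of learned query/key/value projections, and the residual sum used to fuse the two branches are all assumptions made for illustration. Spatial attention builds an N x N map over voxel tokens, while channel attention builds a C x C map over feature channels.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention_map(x):
    # x: (N, C) tokens. Each token attends to every other token: map is (N, N).
    n, c = x.shape
    return softmax(x @ x.T / np.sqrt(c), axis=-1)


def channel_attention_map(x):
    # Channels play the role of tokens: map is (C, C).
    n, c = x.shape
    return softmax(x.T @ x / np.sqrt(n), axis=-1)


def channel_spatial_block(x):
    # Hypothetical fusion (an assumption, not the paper's design):
    # residual sum of the spatial and channel branches.
    spatial = spatial_attention_map(x) @ x      # (N, N) @ (N, C) -> (N, C)
    channel = x @ channel_attention_map(x).T    # (N, C) @ (C, C) -> (N, C)
    return x + spatial + channel


# Toy 3D feature volume with depth, height, width of 2 and 4 channels,
# flattened into 8 voxel tokens.
rng = np.random.default_rng(0)
volume = rng.standard_normal((2, 2, 2, 4))
tokens = volume.reshape(-1, volume.shape[-1])   # (8, 4)
out = channel_spatial_block(tokens)
```

The cost contrast is visible in the map shapes: the spatial map grows quadratically with the number of voxels, whereas the channel map depends only on the channel count, which is one reason channel attention is often used to keep 3D models tractable.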
Pages: 12