GSAC-UFormer: Groupwise Self-Attention Convolutional Transformer-Based UNet for Medical Image Segmentation

Cited: 3
Authors
Garbaz, Anass [1 ]
Oukdach, Yassine [1 ]
Charfi, Said [1 ]
El Ansari, Mohamed [2 ]
Koutti, Lahcen [1 ]
Salihoun, Mouna [3 ]
Affiliations
[1] Ibn Zohr Univ, Fac Sci, Dept Comp Sci, LabSIV, Agadir, Morocco
[2] My Ismail Univ, Fac Sci, Dept Comp Sci, Informat & Applicat Lab, Meknes, Morocco
[3] Mohammed V Univ, Fac Med & Pharm, Rabat 10100, Morocco
Keywords
GSAC-UFormer; Transformer; UNet; Self-attention; Groupwise convolution; Medical image segmentation; SKIN-LESION SEGMENTATION; SEMANTIC SEGMENTATION; PLUS PLUS; NETWORK; NET;
DOI
10.1007/s12559-025-10425-1
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional transformers struggle to effectively capture local contextual information. Conversely, CNNs face challenges in modeling long-range dependencies. To address these limitations, this paper introduces GSAC-UFormer, an innovative Groupwise Self-Attention Convolutional Transformer-based UNet for medical image segmentation. The design of GSAC-UFormer focuses on efficiently integrating both local and global information, balancing the strengths of different processing techniques. At the core of GSAC-UFormer is the GSAC-Former block. This module combines groupwise convolution with a CNN-adaptive self-attention mechanism, enabling parallel integration of local and global contexts. This architecture allows the model to effectively capture intricate dependencies across various data dimensions while processing local features with high efficiency. The Guided Contextual Feature Attention (GCFA) mechanism further enhances feature selection. It emphasizes the most relevant contextual information, refining spatial and channel-wise relationships in the extracted features. This targeted approach mitigates noise and improves model accuracy. Additionally, the Multi-Depth Partitioned Depthwise Convolution Transformer (MDPDC-Former) serves as a bottleneck module. It optimizes feature mapping and enhances network learning efficiency by dynamically adjusting the receptive field. This enables the model to capture multi-scale semantic information more effectively. Experimental results highlight the superior performance of GSAC-UFormer compared to state-of-the-art methods. It achieves Dice coefficients of 91.6%, 94.61%, and 82.24% on the MICCAI 2017 (red lesion), PH2, and CVC-ClinicalDB datasets, respectively. These results underscore its effectiveness in advancing medical image segmentation.
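The abstract builds the GSAC-Former block on groupwise convolution, which splits the input channels into groups and convolves each group independently, reducing parameters and compute by a factor of the group count. As a minimal, framework-free sketch of that idea (a naive pure-Python implementation for illustration only, not the authors' code; the function name and tensor layout are assumptions):

```python
def groupwise_conv2d(x, weights, groups):
    """Naive groupwise 2D convolution (no padding, stride 1).

    x:       input feature map as nested lists, shape (C_in, H, W)
    weights: filters, shape (C_out, C_in // groups, kH, kW)
    groups:  number of channel groups; each output channel only sees
             the input channels of its own group, so the weight count
             drops by a factor of `groups` versus a dense convolution.
    """
    c_in, c_out = len(x), len(weights)
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    h, w = len(x[0]), len(x[0][0])
    cin_g = c_in // groups    # input channels per group
    cout_g = c_out // groups  # output channels per group
    out = []
    for oc in range(c_out):
        g = oc // cout_g  # group this output channel belongs to
        chans = range(g * cin_g, (g + 1) * cin_g)
        plane = [[0.0] * (w - kw + 1) for _ in range(h - kh + 1)]
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                s = 0.0
                # Sum only over the channels of this group.
                for ci, c in enumerate(chans):
                    for u in range(kh):
                        for v in range(kw):
                            s += x[c][i + u][j + v] * weights[oc][ci][u][v]
                plane[i][j] = s
        out.append(plane)
    return out
```

With 4 input channels, `groups=2`, and all-ones 1x1 filters, output channel 0 sums only channels 0-1 and output channel 1 sums only channels 2-3, which is exactly the channel isolation the paper exploits before fusing local and global context through its self-attention branch.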
Pages: 14