Improved UNet with Attention for Medical Image Segmentation

Citations: 22
Authors
AL Qurri, Ahmed [1]
Almekkawy, Mohamed [1]
Affiliations
[1] Penn State Univ, Sch Elect Engn & Comp Sci, University Pk, PA 16802 USA
Keywords
UNet; UNet++; Transformer; CNN; attention; medical imaging; ultrasound; CT scan; U-NET; PLUS PLUS; ARCHITECTURE; TRANSFORMER
DOI
10.3390/s23208589
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Medical image segmentation is crucial for medical image processing and the development of computer-aided diagnostics. In recent years, deep Convolutional Neural Networks (CNNs) have been widely adopted for medical image segmentation and have achieved significant success. UNet, which is based on CNNs, is the mainstream method for medical image segmentation. However, its performance suffers from its inability to capture long-range dependencies. Transformers, initially designed for Natural Language Processing (NLP) and sequence-to-sequence applications, have demonstrated the ability to capture long-range dependencies. However, their ability to acquire local information is limited. Hybrid architectures of CNNs and Transformers, such as TransUNet, have been proposed to benefit from the Transformer's long-range dependencies and the CNN's low-level details. Nevertheless, automatic medical image segmentation remains a challenging task due to factors such as blurred boundaries, low-contrast tissue environments, and, in the context of ultrasound, issues like speckle noise and attenuation. In this paper, we propose a new model that combines the strengths of both CNNs and Transformers, with network architectural improvements designed to enrich the feature representation captured by the skip connections and the decoder. To this end, we devised a new attention module called Three-Level Attention (TLA). This module is composed of an Attention Gate (AG), channel attention, and a spatial normalization mechanism. The AG preserves structural information, whereas channel attention helps to model the interdependencies between channels. Spatial normalization employs the spatial coefficient of the Transformer to improve spatial attention, akin to TransNorm. To further improve the skip connections and reduce the semantic gap, the skip connections between the encoder and decoder were redesigned in a manner similar to the UNet++ dense connections. Moreover, deep supervision using a side-output channel was introduced, analogous to BASNet, which was originally used for saliency prediction. Two datasets from different modalities, a CT scan dataset and an ultrasound dataset, were used to evaluate the proposed UNet architecture. The experimental results showed that our model consistently improved the prediction performance of the UNet across different datasets.
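The abstract names the components of the TLA module but does not give their formulas. As a rough illustration only, the following NumPy sketch shows two of the three components in their commonly used forms: an additive attention gate (in the style of Attention U-Net) and a squeeze-and-excitation style channel attention. All function names, weight shapes, and the choice of these particular formulations are assumptions made for illustration, not the authors' implementation; the spatial normalization step, which relies on Transformer spatial coefficients, is not sketched.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_gate(skip, gate, w_skip, w_gate, psi):
    """Additive attention gate (assumed form, after Attention U-Net).

    skip:   (C, H, W) encoder feature map from the skip connection
    gate:   (Cg, H, W) decoder feature map, assumed already resized to H x W
    w_skip: (F, C), w_gate: (F, Cg), psi: (F,) play the role of 1x1 convolutions
    Returns the gated skip features and the (H, W) attention map.
    """
    q = np.einsum("fc,chw->fhw", w_skip, skip)
    q = q + np.einsum("fc,chw->fhw", w_gate, gate)
    q = np.maximum(q, 0.0)                            # ReLU
    alpha = sigmoid(np.einsum("f,fhw->hw", psi, q))   # per-pixel weight in [0, 1]
    return skip * alpha[None, :, :], alpha


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention (assumed form).

    x: (C, H, W); w1: (C // r, C); w2: (C, C // r) for a reduction ratio r.
    Returns the reweighted features and the (C,) channel weights.
    """
    squeezed = x.mean(axis=(1, 2))                    # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)           # bottleneck + ReLU
    scale = sigmoid(w2 @ hidden)                      # per-channel weight in [0, 1]
    return x * scale[:, None, None], scale


# Toy usage with random weights (hypothetical sizes: C = Cg = 8, F = 4, r = 4).
rng = np.random.default_rng(0)
skip = rng.standard_normal((8, 16, 16))
gate = rng.standard_normal((8, 16, 16))
gated, alpha = attention_gate(
    skip, gate,
    rng.standard_normal((4, 8)),
    rng.standard_normal((4, 8)),
    rng.standard_normal(4),
)
out, scale = channel_attention(
    gated, rng.standard_normal((2, 8)), rng.standard_normal((8, 2))
)
print(out.shape, alpha.shape, scale.shape)
```

In this sketch the gate produces one weight per pixel and the channel attention one weight per channel, so applying them in sequence rescales the skip features along both the spatial and channel axes without changing their shape.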
Pages: 17
Related papers
50 in total
  • [21] DMFC-UFormer: Depthwise multi-scale factorized convolution transformer-based UNet for medical image segmentation
    Garbaz, Anass
    Oukdach, Yassine
    Charfi, Said
    El Ansari, Mohamed
    Koutti, Lahcen
    Salihoun, Mouna
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 101
  • [22] A Comprehensive Exploration of L-UNet Approach: Revolutionizing Medical Image Segmentation
    Alafer, Feras
    Hameed Siddiqi, Muhammad
    Sheraz Khan, Muhammad
    Ahmad, Irshad
    Alhujaili, Sultan
    Alrowaili, Ziyad
    Saad Alshabibi, Abdulaziz
    IEEE ACCESS, 2024, 12 : 140769 - 140791
  • [23] VM-UNet++ research on crack image segmentation based on improved VM-UNet
    Tang, Wenliang
    Wu, Ziyi
    Wang, Wei
    Pan, Youqin
    Gan, Weihua
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [24] ERDUnet: An Efficient Residual Double-Coding Unet for Medical Image Segmentation
    Li, Hao
    Zhai, Di-Hua
    Xia, Yuanqing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 2083 - 2096
  • [25] Multimodal parallel attention network for medical image segmentation
    Wang, Zhibing
    Wang, Wenmin
    Li, Nannan
    Zhang, Shenyong
    Chen, Qi
    Jiang, Zhe
    IMAGE AND VISION COMPUTING, 2024, 147
  • [26] NFMPAtt-Unet: Neighborhood Fuzzy C-means Multi-scale Pyramid Hybrid Attention Unet for medical image segmentation
    Zhao, Xinpeng
    Xu, Weihua
    NEURAL NETWORKS, 2024, 178
  • [27] MDA-Unet: A Multi-Scale Dilated Attention U-Net for Medical Image Segmentation
    Amer, Alyaa
    Lambrou, Tryphon
    Ye, Xujiong
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [28] Advantages of transformer and its application for medical image segmentation: a survey
    Pu, Qiumei
    Xi, Zuoxin
    Yin, Shuai
    Zhao, Zhe
    Zhao, Lina
    BIOMEDICAL ENGINEERING ONLINE, 2024, 23 (01)
  • [29] VM-UNET-V2: Rethinking Vision Mamba UNet for Medical Image Segmentation
    Zhang, Mingya
    Yu, Yue
    Jin, Sun
    Gu, Limei
    Ling, Tingsheng
    Tao, Xianping
    BIOINFORMATICS RESEARCH AND APPLICATIONS, PT I, ISBRA 2024, 2024, 14954 : 335 - 346
  • [30] Ultrasound Image Segmentation using a Model of Transformer and DFT
    Al-Qurri, Ahmed
    Almekkawy, Mohamed
    2024 IEEE UFFC LATIN AMERICA ULTRASONICS SYMPOSIUM, LAUS, 2024,