Permutation invariant self-attention infused U-shaped transformer for medical image segmentation

Authors
Patil, Sanjeet S. [1]
Ramteke, Manojkumar [1, 2]
Rathore, Anurag S. [1, 2]
Affiliations
[1] Indian Inst Technol Delhi, Dept Chem Engn, New Delhi 110016, India
[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India
Keywords
Permutation invariant attention; Transformers; Segmentation; Medical image analysis; Cardiac MRI; Abdominal CT
DOI
10.1016/j.neucom.2025.129577
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The size and shape of organs in the human body vary according to factors like genetics, body size, proportions, health, lifestyle, gender, ethnicity, and race. Further, abnormalities due to cancer and chronic diseases also affect the size of organs and tumors. Moreover, the spatial location and area of these organs deviate along the transverse plane (Z plane) of the medical scans. Therefore, the generalizability and robustness of a computer vision framework over medical images can be improved if the framework is also encouraged to learn representations of the target areas regardless of their spatial location in input images. Hence, we propose a novel permutation invariant multi-headed self-attention (PISA) module to reduce the sensitivity of Swin-UNet, a U-shaped transformer-based architecture, towards permutation. We have infused this module in the skip connection of our architecture. We have achieved a mean Dice score of 79.25 on the segmentation of 8 abdominal organs, better than most state-of-the-art algorithms. Moreover, we have analyzed the generalizability of our architecture over publicly available multi-sequence cardiac MRI datasets. When tested over a sequence unseen by the model during training, improvements of 25.1% and 9.0% in Dice score were observed in comparison to the pure-CNN-based algorithm and the pure transformer-based architecture, respectively, thereby demonstrating its versatility. Replacing the self-attention module in a U-shaped transformer architecture with our Permutation Invariant Self-Attention module produced noteworthy segmentations over shuffled test images, even though the module was trained solely on normal images. The results demonstrate the enhanced efficiency of the proposed module in imparting attention to target organs irrespective of their spatial positions.
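The two quantities the abstract relies on, behaviour under permutation and the Dice score, can be illustrated with a minimal NumPy sketch. This is not the authors' PISA module, whose design is not described in this record. It assumes a plain single-head self-attention layer with no positional encoding or relative position bias, and the names `self_attention` and `dice_score` are hypothetical helpers introduced only for illustration. Under those assumptions, shuffling the input tokens shuffles the output in the same way, and any order-independent pooling of the output is unchanged.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head self-attention WITHOUT positional terms (illustrative assumption)."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks, in [0, 1]."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


rng = np.random.default_rng(0)
n_tokens, dim = 16, 8  # e.g. 16 image patches with 8 features each
tokens = rng.normal(size=(n_tokens, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
perm = rng.permutation(n_tokens)  # a random shuffle of the patch order

out = self_attention(tokens, w_q, w_k, w_v)
out_shuffled = self_attention(tokens[perm], w_q, w_k, w_v)

# Shuffling the input rows shuffles the output rows identically.
equivariant = np.allclose(out[perm], out_shuffled)
# Mean pooling over tokens ignores order, so the pooled output is unchanged.
invariant_pooled = np.allclose(out.mean(axis=0), out_shuffled.mean(axis=0))

# Dice on toy masks: identical masks overlap fully, disjoint masks not at all.
mask_a = np.zeros((4, 4), dtype=bool)
mask_a[:2] = True
mask_b = ~mask_a
dice_same = dice_score(mask_a, mask_a)
dice_disjoint = dice_score(mask_a, mask_b)
```

Architectures such as Swin-UNet add positional information to attention, which is what makes them sensitive to patch order. The sketch only shows the baseline property that holds once such terms are absent. The abstract reports Dice on a 0 to 100 scale, which corresponds to multiplying `dice_score` by 100.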
Pages: 13
Related papers
50 in total
  • [21] Uformer: A General U-Shaped Transformer for Image Restoration
    Wang, Zhendong
    Cun, Xiaodong
    Bao, Jianmin
    Zhou, Wengang
    Liu, Jianzhuang
    Li, Houqiang
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 17662 - 17672
  • [22] MSCT-UNET: multi-scale contrastive transformer within U-shaped network for medical image segmentation
    Xi, Heran
    Dong, Haoji
    Sheng, Yue
    Cui, Hui
    Huang, Chengying
    Li, Jinbao
    Zhu, Jinghua
    PHYSICS IN MEDICINE AND BIOLOGY, 2024, 69 (01):
  • [23] GSAC-UFormer: Groupwise Self-Attention Convolutional Transformer-Based UNet for Medical Image Segmentation
    Garbaz, Anass
    Oukdach, Yassine
    Charfi, Said
    El Ansari, Mohamed
    Koutti, Lahcen
    Salihoun, Mouna
    COGNITIVE COMPUTATION, 2025, 17 (02)
  • [24] RockFormer: A U-Shaped Transformer Network for Martian Rock Segmentation
    Liu, Haiqiang
    Yao, Meibao
    Xiao, Xueming
    Xiong, Yonggang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [25] MP-FocalUNet: Multiscale parallel focal self-attention U-Net for medical image segmentation
    Wang, Chuan
    Jiang, Mingfeng
    Li, Yang
    Wei, Bo
    Li, Yongming
    Wang, Pin
    Yang, Guang
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2025, 260
  • [26] Attention U-shaped network for hyperspectral image classification
    Wang, Ruirui
    Liu, Bing
    Yu, Anzhu
    Wang, Wenjie
    Jiao, Xuejun
    JOURNAL OF APPLIED REMOTE SENSING, 2022, 16 (03)
  • [27] Sparse Coding Inspired LSTM and Self-Attention Integration for Medical Image Segmentation
    Ji, Zexuan
    Ye, Shunlong
    Ma, Xiao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 6098 - 6113
  • [28] TU-Net: U-shaped Structure Based on Transformers for Medical Image Segmentation
    Zhao, Jiamei
    Wu, Dikang
    Wang, Zhifang
    DATA SCIENCE (ICPCSEE 2022), PT I, 2022, 1628 : 376 - 386
  • [29] DI-Unet: Dimensional interaction self-attention for medical image segmentation
    Wu, Yanlin
    Wang, Guanglei
    Wang, Zhongyang
    Wang, Hongrui
    Li, Yan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 78
  • [30] Transformer with sparse self-attention mechanism for image captioning
    Wang, Duofeng
    Hu, Haifeng
    Chen, Dihu
    ELECTRONICS LETTERS, 2020, 56 (15) : 764 - +