Permutation invariant self-attention infused U-shaped transformer for medical image segmentation

Cited by: 0
Authors
Patil, Sanjeet S. [1]
Ramteke, Manojkumar [1, 2]
Rathore, Anurag S. [1, 2]
Affiliations
[1] Indian Inst Technol Delhi, Dept Chem Engn, New Delhi 110016, India
[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India
Keywords
Permutation invariant attention; Transformers; Segmentation; Medical image analysis; Cardiac MRI; Abdominal CT
DOI
10.1016/j.neucom.2025.129577
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The size and shape of organs in the human body vary with factors such as genetics, body size, proportions, health, lifestyle, gender, ethnicity, and race. Abnormalities due to cancer and chronic diseases also affect the size of organs and tumors. Moreover, the spatial location and area of these organs deviate along the transverse plane (Z plane) of medical scans. The generalizability and robustness of a computer vision framework over medical images can therefore be improved if the framework is also encouraged to learn representations of the target areas regardless of their spatial location in the input images. Hence, we propose a novel permutation invariant multi-headed self-attention (PISA) module to reduce the sensitivity of Swin-UNet, a U-shaped transformer-based architecture, towards permutation. We have infused this module in the skip connection of our architecture. We have achieved a mean Dice score of 79.25 on the segmentation of 8 abdominal organs, better than most state-of-the-art algorithms. Moreover, we have analyzed the generalizability of our architecture over publicly available multi-sequence cardiac MRI datasets. When tested over a sequence unseen by the model during training, improvements of 25.1% and 9.0% in Dice score were observed in comparison to the pure-CNN-based algorithm and the pure transformer-based architecture, respectively, thereby demonstrating its versatility. Replacing the self-attention module in a U-shaped transformer architecture with our permutation invariant self-attention module produced noteworthy segmentations over shuffled test images, even though the module was trained solely on normal images. The results demonstrate the enhanced efficiency of the proposed module in imparting attention to target organs irrespective of their spatial positions.
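As background for the permutation property the abstract refers to, the following is a minimal sketch, not the authors' PISA module. It assumes plain single-head scaled dot-product self-attention with no positional encoding, and it illustrates that shuffling the input tokens only shuffles the output rows in the same way (equivariance), so an order-independent pooling of those rows is unchanged (invariance). The names `self_attention`, `mean_pool`, and the random weights are hypothetical and chosen only for illustration; practical architectures such as Swin-UNet add window partitioning and positional terms, which do not preserve this property.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention, no positional encoding."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                       # token-to-token similarities
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def mean_pool(rows):
    """Average over tokens; the result does not depend on token order."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_tokens, dim = 6, 4
tokens = random_matrix(n_tokens, dim)             # stand-in for image patch embeddings
wq, wk, wv = (random_matrix(dim, dim) for _ in range(3))

perm = list(range(n_tokens))
random.shuffle(perm)
shuffled = [tokens[j] for j in perm]

out = self_attention(tokens, wq, wk, wv)
out_shuffled = self_attention(shuffled, wq, wk, wv)

# Equivariance: row i of the shuffled output matches row perm[i] of the original.
equivariant = all(
    abs(a - b) < 1e-9
    for i in range(n_tokens)
    for a, b in zip(out_shuffled[i], out[perm[i]])
)
# Invariance after pooling: the pooled vectors agree up to rounding.
invariant = all(
    abs(a - b) < 1e-9 for a, b in zip(mean_pool(out), mean_pool(out_shuffled))
)
print(equivariant, invariant)
```

Both checks use a small tolerance because reordering the tokens changes the order of the floating-point summations.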
Pages: 13