Permutation invariant self-attention infused U-shaped transformer for medical image segmentation

Cited by: 0
Authors
Patil, Sanjeet S. [1 ]
Ramteke, Manojkumar [1 ,2 ]
Rathore, Anurag S. [1 ,2 ]
Affiliations
[1] Indian Inst Technol Delhi, Dept Chem Engn, New Delhi 110016, India
[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India
Keywords
Permutation invariant attention; Transformers; Segmentation; Medical image analysis; Cardiac MRI; Abdominal CT;
DOI
10.1016/j.neucom.2025.129577
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The size and shape of organs in the human body vary with factors such as genetics, body size, proportions, health, lifestyle, gender, ethnicity, and race. Abnormalities due to cancer and chronic diseases further affect the size of organs and tumors. Moreover, the spatial location and area of these organs deviate along the transverse plane (Z plane) of medical scans. The generalizability and robustness of a computer vision framework over medical images can therefore be improved if the framework is also encouraged to learn representations of the target areas regardless of their spatial location in the input images. Hence, we propose a novel permutation invariant multi-headed self-attention (PISA) module to reduce the sensitivity of a U-shaped transformer-based architecture, Swin-UNet, towards permutation. We have infused this module into the skip connections of our architecture. We achieved a mean Dice score of 79.25 on the segmentation of 8 abdominal organs, better than most state-of-the-art algorithms. Moreover, we analyzed the generalizability of our architecture over publicly available multi-sequence cardiac MRI datasets. When tested on a sequence unseen by the model during training, improvements of 25.1% and 9.0% in Dice scores were observed in comparison to a pure-CNN-based algorithm and a pure transformer-based architecture, respectively, demonstrating its versatility. Replacing the self-attention module in a U-shaped transformer architecture with our permutation invariant self-attention module produced noteworthy segmentations on shuffled test images, even though the module was trained solely on normal images. The results demonstrate the enhanced efficiency of the proposed module in imparting attention to target organs irrespective of their spatial positions.
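The record does not specify the PISA module's internals, but the property it builds on can be illustrated: dot-product self-attention without positional encodings is permutation-equivariant (shuffling the input tokens shuffles the output rows identically), and a symmetric readout over tokens then becomes fully permutation-invariant. The following NumPy sketch is an illustration of that general property, not the paper's implementation; all function and variable names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over token matrix X (n, d),
    # with no positional encodings.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]))
    return A @ V

rng = np.random.default_rng(0)
n, d = 6, 4
X = rng.normal(size=(n, d))                       # n tokens (e.g. image patches)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

perm = rng.permutation(n)                         # shuffle the token order
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)

# Equivariance: permuting the tokens permutes the output rows the same way.
assert np.allclose(out[perm], out_perm)

# A symmetric readout (here: mean-pooling over tokens) is then fully
# permutation-invariant.
assert np.allclose(out.mean(axis=0), out_perm.mean(axis=0))
```

In windowed architectures such as Swin-UNet, positional biases and window partitioning break this property, which is why a dedicated permutation-invariant attention module can matter for organs whose position varies across scans.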
Pages: 13
Related Papers
47 records in total
  • [21] ECA-TFUnet: A U-shaped CNN-Transformer network with efficient channel attention for organ segmentation in anatomical sectional images of canines
    Liu, Yunling
    Liu, Yaxiong
    Li, Jingsong
    Chen, Yaoxing
    Xu, Fengjuan
    Xu, Yifa
    Cao, Jing
    Ma, Yuntao
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (10) : 18650 - 18669
  • [22] Densely Connected Transformer With Linear Self-Attention for Lightweight Image Super-Resolution
    Zeng, Kun
    Lin, Hanjiang
    Yan, Zhiqiang
    Fang, Jinsheng
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [23] US-Net: U-shaped network with Convolutional Attention Mechanism for ultrasound medical images
    Xie, Xiaoyu
    Liu, Pingping
    Lang, Yijun
    Guo, Zhenjie
    Yang, Zhongxi
    Zhao, Yuhao
    COMPUTERS & GRAPHICS-UK, 2024, 124
  • [24] CTDUNet: A Multimodal CNN-Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments
    Guo, Ruitian
    Zhang, Ruopeng
    Zhou, Hao
    Xie, Tunjun
    Peng, Yuting
    Chen, Xili
    Yu, Guo
    Wan, Fangying
    Li, Lin
    Zhang, Yongzhong
    Liu, Ruifeng
    PLANTS-BASEL, 2024, 13 (16)
  • [25] U2-Former: Nested U-Shaped Transformer for Image Restoration via Multi-View Contrastive Learning
    Feng, Xin
    Ji, Haobo
    Pei, Wenjie
    Li, Jinxing
    Lu, Guangming
    Zhang, David
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 168 - 181
  • [26] Breast cancer classification and segmentation framework using multiscale CNN and U-shaped dual decoded attention network
    Umer, Muhammad Junaid
    Sharif, Muhammad
    Wang, Shui-Hua
    EXPERT SYSTEMS, 2022
  • [27] GLSANet: Global-Local Self-Attention Network for Remote Sensing Image Semantic Segmentation
    Hu, Xudong
    Zhang, Penglin
    Zhang, Qi
    Yuan, Feng
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [29] DAE-Former: Dual Attention-Guided Efficient Transformer for Medical Image Segmentation
    Azad, Reza
    Arimond, Rene
    Aghdam, Ehsan Khodapanah
    Kazerouni, Amirhossein
    Merhof, Dorit
    PREDICTIVE INTELLIGENCE IN MEDICINE, PRIME 2023, 2023, 14277 : 83 - 95
  • [30] TGDAUNet: Transformer and GCNN based dual-branch attention UNet for medical image segmentation
    Song, Pengfei
    Li, Jinjiang
    Fan, Hui
    Fan, Linwei
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 167