SSTrans-Net: Smart Swin Transformer Network for medical image segmentation

Cited by: 17
Authors
Fu, Liyao [1 ]
Chen, Yunzhu [1 ]
Ji, Wei [1 ]
Yang, Feng [1 ,2 ]
Affiliations
[1] Guangxi Univ, Sch Comp & Elect & Informat, Nanning 530004, Guangxi, Peoples R China
[2] Guangxi Univ, Guangxi Key Lab Multimedia Commun Network Technol, Nanning 530004, Guangxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical image analysis; Medical image segmentation; Multi-organ segmentation; Swin transformer;
DOI
10.1016/j.bspc.2024.106071
CLC number
R318 [Biomedical Engineering];
Discipline code
0831;
Abstract
Medical image segmentation has achieved impressive results through recent transformer-based works; in particular, the Swin Transformer has shown its superiority on several segmentation tasks. However, the identical and fixed masks in the Swin Transformer prevent interactions among ultra-long-range pixels across all channels, whereas capturing long-range dependencies in some channels is beneficial for multi-organ segmentation. In this paper, we propose a u-shaped Smart Swin Transformer Network (SSTrans-Net) for multi-organ segmentation. In SSTrans-Net, the Smart Shifted Window Multi-Head Self-Attention (SSW-MSA) module replaces the mask-based module of the Swin Transformer to learn different channel-wise features, focusing on the relevant dependencies among organs. Specifically, it retains effective long-range dependencies in the channels that focus exclusively on the target distribution and removes them from the channels that concentrate on local context. In addition, we introduce the Dice and Focal loss functions to supervise the optimization of the Smart Swin Transformer, improving its ability to balance global and local features. Experiments on the Synapse and ACDC datasets demonstrate that our strategy requires fewer computational resources than most segmenters and significantly improves the segmentation performance of the model. Our code is available at https://github.com/suofer/Smart-Swin-Transformer.
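The combined Dice and Focal supervision mentioned in the abstract can be sketched as follows. This is a minimal, framework-free illustration of the two standard loss terms on flattened binary foreground probabilities; the function names, the equal weighting, and the hyperparameter values are assumptions for illustration, not the paper's implementation.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    # Soft Dice loss: 1 - 2|X ∩ Y| / (|X| + |Y|), a region-overlap (global) term.
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def focal_loss(probs, targets, gamma=2.0):
    # Binary focal loss: cross-entropy scaled by (1 - p_t)^gamma, which
    # down-weights easy pixels and emphasizes hard (local) ones.
    losses = []
    for p, t in zip(probs, targets):
        pt = p if t == 1 else 1.0 - p          # probability of the true class
        pt = min(max(pt, 1e-6), 1.0 - 1e-6)    # clip for numerical stability
        losses.append(-((1.0 - pt) ** gamma) * math.log(pt))
    return sum(losses) / len(losses)

def combined_loss(probs, targets, w_dice=0.5, w_focal=0.5):
    # Weighted sum balancing the global (Dice) and local (Focal) terms;
    # the 0.5/0.5 weighting is a placeholder, not the paper's setting.
    return w_dice * dice_loss(probs, targets) + w_focal * focal_loss(probs, targets)
```

A perfect prediction drives both terms toward zero, while confident mistakes are penalized heavily by the focal term.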
Pages: 9