SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

Cited by: 5
Authors
Perera, Shehan [1]
Navard, Pouyan [1]
Yilmaz, Alper [1]
Affiliations
[1] Ohio State University, Photogrammetric Computer Vision Lab, Columbus, OH 43210, USA
DOI
10.1109/CVPRW63382.2024.00503
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The adoption of Vision Transformer (ViT)-based architectures represents a significant advancement in 3D Medical Image (MI) segmentation, surpassing traditional Convolutional Neural Network (CNN) models by enhancing global contextual understanding. While this paradigm shift has significantly improved 3D segmentation performance, state-of-the-art models are extremely large and complex and require large-scale computing resources for training and deployment. Furthermore, with the limited datasets often encountered in medical imaging, larger models can present hurdles in both generalization and convergence. In response to these challenges, and to demonstrate that lightweight models are a valuable area of research in 3D medical imaging, we present SegFormer3D, a hierarchical Transformer that calculates attention across multiscale volumetric features. Additionally, SegFormer3D avoids complex decoders and uses an all-MLP decoder to aggregate local and global attention features to produce highly accurate segmentation masks. The proposed memory-efficient Transformer preserves the performance characteristics of a significantly larger model in a compact design. SegFormer3D democratizes deep learning for 3D medical image segmentation by offering a model with 33x fewer parameters and a 13x reduction in GFLOPS compared to the current state-of-the-art (SOTA). We benchmark SegFormer3D against the current SOTA models on three widely used datasets (Synapse, BraTS, and ACDC), achieving competitive results. Code: https://github.com/OSUPCVLab/SegFormer3D.git
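The all-MLP decoder mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it only shows the general idea of projecting multiscale volumetric features to a common width, upsampling them to one resolution, and fusing them with per-voxel linear layers. The function names, channel widths, feature sizes, class count, and random weights are all hypothetical, chosen purely for illustration.

```python
import numpy as np

def linear(x, w, b):
    """Per-voxel linear layer over the channel axis of a (C, D, H, W) array."""
    # (C_out, C_in) contracted with (C_in, D, H, W) -> (C_out, D, H, W)
    return np.tensordot(w, x, axes=([1], [0])) + b[:, None, None, None]

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of the three spatial axes by an integer factor."""
    for axis in (1, 2, 3):
        x = np.repeat(x, factor, axis=axis)
    return x

def all_mlp_decoder(features, embed_dim, num_classes, rng):
    """Fuse multiscale features (highest resolution first) into class logits.

    Assumes cubic feature maps whose sizes divide the first one evenly.
    Weights are random placeholders; a real model would learn them.
    """
    target = features[0].shape[1]
    projected = []
    for f in features:
        # 1) project each stage to a common channel width
        w = rng.standard_normal((embed_dim, f.shape[0])) * 0.02
        p = linear(f, w, np.zeros(embed_dim))
        # 2) bring every stage to the highest resolution
        projected.append(upsample_nearest(p, target // f.shape[1]))
    # 3) concatenate along channels and fuse with a linear layer + ReLU
    fused = np.concatenate(projected, axis=0)
    w_fuse = rng.standard_normal((embed_dim, fused.shape[0])) * 0.02
    fused = np.maximum(linear(fused, w_fuse, np.zeros(embed_dim)), 0.0)
    # 4) per-voxel classifier producing one logit map per class
    w_cls = rng.standard_normal((num_classes, embed_dim)) * 0.02
    return linear(fused, w_cls, np.zeros(num_classes))

rng = np.random.default_rng(0)
channels = [32, 64, 160, 256]   # hypothetical stage widths
sizes = [16, 8, 4, 2]           # hypothetical spatial sizes per stage
features = [rng.standard_normal((c, s, s, s)) for c, s in zip(channels, sizes)]
logits = all_mlp_decoder(features, embed_dim=64, num_classes=3, rng=rng)
print(logits.shape)
```

Because every step after the encoder is a channel-wise linear map or a parameter-free resize, a decoder of this kind adds few parameters, which is consistent with the lightweight design the abstract emphasizes.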
Pages: 4981-4988
Page count: 8