D-former: a U-shaped Dilated Transformer for 3D medical image segmentation

Times cited: 0
Authors
Wu, Yixuan [1]
Liao, Kuanlun [2]
Chen, Jintai [2]
Wang, Jinhong [2]
Chen, Danny Z. [3]
Gao, Honghao [4, 5]
Wu, Jian [6, 7]
Affiliations
[1] Zhejiang Univ, Sch Med, Hangzhou 310030, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310058, Peoples R China
[3] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
[4] Gachon Univ, Coll Future Ind, Seongnam 13120, South Korea
[5] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[6] Zhejiang Univ, Affiliated Hosp 2, Sch Med, Hangzhou 310058, Peoples R China
[7] Zhejiang Univ, Sch Publ Hlth, Hangzhou 310058, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical image analysis; Segmentation; Transformer; Long-range dependency; Position encoding; NETWORKS; ATTENTION;
DOI
10.1007/s00521-022-07859-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Computer-aided medical image segmentation has been applied widely in diagnosis and treatment to obtain clinically useful information on the shapes and volumes of target organs and tissues. In the past several years, convolutional neural network (CNN)-based methods (e.g., U-Net) have dominated this area, but they still suffer from inadequate capture of long-range information. Hence, recent work has presented computer vision Transformer variants for medical image segmentation tasks and obtained promising performance. Such Transformers model long-range dependency by computing pair-wise patch relations. However, they incur prohibitive computational costs, especially on 3D medical images (e.g., CT and MRI). In this paper, we propose a new method called Dilated Transformer, which conducts self-attention alternately in local and global scopes to capture pair-wise patch relations. Inspired by dilated convolution kernels, we conduct the global self-attention in a dilated manner, enlarging receptive fields without increasing the number of patches involved and thus reducing computational costs. Based on this design of Dilated Transformer, we construct a U-shaped encoder-decoder hierarchical architecture called D-Former for 3D medical image segmentation. Experiments on the Synapse and ACDC datasets show that our D-Former model, trained from scratch, outperforms various competitive CNN-based or Transformer-based segmentation models at a low computational cost, without a time-consuming pre-training process.
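The alternation of local and dilated (global) self-attention described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 1D sequence of patches instead of a 3D volume, uses the patch features directly as queries, keys, and values (no learned projections, heads, or position encoding), and the helper names `local_groups`, `dilated_groups`, and `group_attention` are hypothetical. It only shows that both grouping schemes attend over the same number of patches per group while the dilated groups span a much wider range.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def group_attention(x, index_groups):
    """Plain self-attention run independently inside each group of patch indices.

    Assumption: features act as queries, keys, and values (no learned weights).
    """
    out = np.empty_like(x)
    d = x.shape[-1]
    for idx in index_groups:
        g = x[idx]                     # (group_size, d)
        scores = g @ g.T / np.sqrt(d)  # pair-wise relations inside the group
        out[idx] = softmax(scores) @ g
    return out

def local_groups(n, size):
    # Contiguous windows: [0..size-1], [size..2*size-1], ...
    return np.arange(n).reshape(n // size, size)

def dilated_groups(n, size):
    # Strided groups: group j holds patches j, j+stride, j+2*stride, ...
    stride = n // size
    return np.arange(n).reshape(size, stride).T

n, d, size = 16, 8, 4                  # illustrative sizes only
rng = np.random.default_rng(0)
x = rng.standard_normal((n, d))

lg = local_groups(n, size)
dg = dilated_groups(n, size)

# One local step followed by one dilated step.
y = group_attention(group_attention(x, lg), dg)

local_span = int(lg[0].max() - lg[0].min())
dilated_span = int(dg[0].max() - dg[0].min())
```

In this toy setting every group holds 4 patches, so each attention step builds 4 x 4 score matrices rather than one 16 x 16 matrix, yet a dilated group reaches across nearly the whole sequence. How D-Former forms its groups in 3D is specified in the paper itself, not here.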
Pages: 1931 - 1944
Number of pages: 14