LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation

Cited by: 127
Authors
Xu, Guoping [1]
Zhang, Xuan [1]
He, Xinwei [2]
Wu, Xinglong [1]
Affiliations
[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Hubei Key Lab Intelligent Robot, Wuhan 430205, Hubei, Peoples R China
[2] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Hubei, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432
Keywords
Medical Image Segmentation; Transformer; Convolutional Neural Network;
DOI
10.1007/978-981-99-8543-2_4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Medical image segmentation plays an essential role in developing computer-assisted diagnosis and treatment systems, yet it still faces numerous challenges. In the past few years, Convolutional Neural Networks (CNNs) have been successfully applied to medical image segmentation. However, because convolution operations are local, CNN-based architectures are limited in learning global context information in images, which may be crucial to the success of medical image segmentation. Meanwhile, vision Transformer (ViT) architectures have a remarkable ability to extract long-range semantic features, at the cost of high computational complexity. To make medical image segmentation more efficient and accurate, we present a novel lightweight architecture named LeViT-UNet, which integrates multi-stage Transformer blocks into the encoder via LeViT, aiming to explore the effectiveness of fusing local and global features. Our experiments on two challenging segmentation benchmarks indicate that the proposed LeViT-UNet achieves competitive performance compared with various state-of-the-art methods in terms of efficiency and accuracy, suggesting that LeViT can be a faster feature encoder for medical image segmentation. LeViT-UNet-384, for instance, achieves a Dice similarity coefficient (DSC) of 78.53% and 90.32% on the Synapse and ACDC datasets, respectively, with a segmentation speed of 85 frames per second (FPS). Therefore, the proposed architecture could be beneficial for prospective clinical trials conducted by radiologists. Our source code is publicly available at https://github.com/apple1986/LeViT_UNet.
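The abstract reports accuracy as a Dice similarity coefficient (DSC). As a rough illustration of that metric only, the sketch below computes per-class DSC, 2|A ∩ B| / (|A| + |B|), over two flat label sequences. The function names, the toy labels, and the convention of returning 1.0 when a class is absent from both inputs are assumptions made for this example; they are not taken from the paper or its repository.

```python
def dice_coefficient(pred, target, label):
    """Dice similarity coefficient for one class over two flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denominator = pred_count + target_count
    if denominator == 0:
        # Assumed convention: a class absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * overlap / denominator


def mean_dice(pred, target, labels):
    """Average the per-class Dice score over the given foreground labels."""
    scores = [dice_coefficient(pred, target, label) for label in labels]
    return sum(scores) / len(scores)


# Toy example: 0 is background, 1 and 2 are hypothetical organ labels.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
print(dice_coefficient(pred, target, 1))  # overlap 1, sizes 2 and 1
print(dice_coefficient(pred, target, 2))  # overlap 2, sizes 2 and 3
print(mean_dice(pred, target, [1, 2]))
```

Real segmentation pipelines usually apply the same formula to flattened 2D or 3D masks and average over classes and cases; how the paper aggregates its scores is not stated in this record.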
Pages: 42-53
Number of pages: 12
Related papers
50 in total
  • [21] ResTrans-Unet: A Residual-Aware Transformer-Based Approach to Medical Image Segmentation
    Ma, Fengying
    Wang, Zhi
    Ji, Peng
    Fu, Chengcai
    Wang, Feng
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2024, 34 (04)
  • [22] CoTrFuse: a novel framework by fusing CNN and transformer for medical image segmentation
    Chen, Yuanbin
    Wang, Tao
    Tang, Hui
    Zhao, Longxuan
    Zhang, Xinlin
    Tan, Tao
    Gao, Qinquan
    Du, Min
    Tong, Tong
    PHYSICS IN MEDICINE AND BIOLOGY, 2023, 68 (17)
  • [23] DMFC-UFormer: Depthwise multi-scale factorized convolution transformer-based UNet for medical image segmentation
    Garbaz, Anass
    Oukdach, Yassine
    Charfi, Said
    El Ansari, Mohamed
    Koutti, Lahcen
    Salihoun, Mouna
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 101
  • [24] RM-UNet: UNet-like Mamba with rotational SSM module for medical image segmentation
    Tang, Hao
    Huang, Guoheng
    Cheng, Lianglun
    Yuan, Xiaochen
    Tao, Qi
    Chen, Xuhang
    Zhong, Guo
    Yang, Xiaohui
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (11) : 8427 - 8443
  • [25] Medical Image Segmentation Based on Transformer and HarDNet Structures
    Shen, Tongping
    Xu, Huanqing
    IEEE ACCESS, 2023, 11 : 16621 - 16630
  • [26] A parallelly contextual convolutional transformer for medical image segmentation
    Feng, Yuncong
    Su, Jianyu
    Zheng, Jian
    Zheng, Yupeng
    Zhang, Xiaoli
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 98
  • [27] Token Sparsification for Faster Medical Image Segmentation
    Zhou, Lei
    Liu, Huidong
    Bae, Joseph
    He, Junjun
    Samaras, Dimitris
    Prasanna, Prateek
    INFORMATION PROCESSING IN MEDICAL IMAGING, IPMI 2023, 2023, 13939 : 743 - 754
  • [28] LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation
    Wang, Jinhong
    Chen, Jintai
    Chen, Danny
    Wu, Jian
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT VIII, 2024, 15008 : 360 - 370
  • [29] CoT-UNet++: A medical image segmentation method based on contextual transformer and dense connection
    Yin, Yijun
    Xu, Wenzheng
    Chen, Lei
    Wu, Hao
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (05) : 8320 - 8336
  • [30] STU3: Multi-organ CT Medical Image Segmentation Model Based on Transformer and UNet
    Zheng, Wenjin
    Li, Bo
    Chen, Wanyi
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT I, 2024, 14473 : 170 - 181