LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation

Times cited: 127

Authors:

Xu, Guoping [1]

Zhang, Xuan [1]

He, Xinwei [2]

Wu, Xinglong [1]

Affiliations:

[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Hubei Key Lab Intelligent Robot, Wuhan 430205, Hubei, Peoples R China

[2] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Hubei, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432

Keywords:

Medical Image Segmentation; Transformer; Convolutional Neural Network

DOI:

10.1007/978-981-99-8543-2_4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Medical image segmentation plays an essential role in developing computer-assisted diagnosis and treatment systems, yet it still faces numerous challenges. In the past few years, Convolutional Neural Networks (CNNs) have been successfully applied to medical image segmentation. However, because convolution operations are local, CNN-based architectures are limited in learning global context information in images, which may be crucial to successful medical image segmentation. Meanwhile, vision Transformer (ViT) architectures have a remarkable ability to extract long-range semantic features, at the cost of high computational complexity. To make medical image segmentation more efficient and accurate, we present a novel lightweight architecture named LeViT-UNet, which integrates multi-stage Transformer blocks into the encoder via LeViT, aiming to explore the effectiveness of fusing local and global features. Our experiments on two challenging segmentation benchmarks indicate that the proposed LeViT-UNet achieves competitive performance compared with various state-of-the-art methods in terms of efficiency and accuracy, suggesting that LeViT can be a faster feature encoder for medical image segmentation. LeViT-UNet-384, for instance, achieves a Dice similarity coefficient (DSC) of 78.53% and 90.32% on the Synapse and ACDC datasets, respectively, with a segmentation speed of 85 frames per second (FPS). Therefore, the proposed architecture could be beneficial for prospective clinical trials conducted by radiologists. Our source code is publicly available at https://github.com/apple1986/LeViT_UNet.

Pages: 42-53

Number of pages: 12

50 items in total

[41] MLFA-UNet: A multi-level feature assembly UNet for medical image segmentation
Garbaz, Anass
Oukdacha, Yassine
Charfi, Said
El Ansari, Mohamed
Koutti, Lahcen
Salihoun, Mouna
METHODS, 2024, 232 : 52 - 64
[42] VM-UNET-V2: Rethinking Vision Mamba UNet for Medical Image Segmentation
Zhang, Mingya
Yu, Yue
Jin, Sun
Gu, Limei
Ling, Tingsheng
Tao, Xianping
BIOINFORMATICS RESEARCH AND APPLICATIONS, PT I, ISBRA 2024, 2024, 14954 : 335 - 346
[43] Diffusion Transformer U-Net for Medical Image Segmentation
Chowdary, G. Jignesh
Yin, Zhaozheng
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 622 - 631
[44] MR-Trans: MultiResolution Transformer for medical image segmentation
Zou, Yibo
Ge, Yan
Zhao, Linlin
Li, Wei
COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 165
[45] A swin-transformer-based network with inductive bias ability for medical image segmentation
Gao, Yan
Xu, Huan
Liu, Quanle
Bie, Mei
Che, Xiangjiu
APPLIED INTELLIGENCE, 2025, 55 (02)
[46] Automatic Medical Image Segmentation with Vision Transformer
Zhang, Jie
Li, Fan
Zhang, Xin
Wang, Huaijun
Hei, Xinhong
APPLIED SCIENCES-BASEL, 2024, 14 (07)
[47] CTRANS: A MULTI-RESOLUTION CONVOLUTION-TRANSFORMER NETWORK FOR MEDICAL IMAGE SEGMENTATION
Gong, Zhendi
French, Andrew P.
Qiu, Guoping
Chen, Xin
IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI 2024, 2024
[48] MCRformer: Morphological constraint reticular transformer for 3D medical image segmentation
Li, Jun
Chen, Nan
Zhou, Han
Lai, Taotao
Dong, Heng
Feng, Chunhui
Chen, Riqing
Yang, Changcai
Cai, Fanggang
Wei, Lifang
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 232
[49] DI-Unet: Dimensional interaction self-attention for medical image segmentation
Wu, Yanlin
Wang, Guanglei
Wang, Zhongyang
Wang, Hongrui
Li, Yan
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 78
[50] NAS-Unet: Neural Architecture Search for Medical Image Segmentation
Weng, Yu
Zhou, Tianbao
Li, Yujie
Qiu, Xiaoyu
IEEE ACCESS, 2019, 7 : 44247 - 44257