SMESwin Unet: Merging CNN and Transformer for Medical Image Segmentation

Cited by: 22
Authors
Wang, Ziheng [1]
Min, Xiongkuo [2]
Shi, Fangyu [2]
Jin, Ruinian [1]
Nawrin, Saida S. [1]
Yu, Ichen [3]
Nagatomi, Ryoichi [1, 3]
Affiliations
[1] Tohoku Univ, Grad Sch Biomed Engn, Div Biomed Engn Hlth & Welf, Sendai, Japan
[2] Shanghai Jiao Tong Univ, Inst Image Commun & Network Engn, Shanghai, Peoples R China
[3] Tohoku Univ, Grad Sch Med, Dept Med & Sci Sports & Exercise, Sendai, Japan
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT V | 2022 / Vol. 13435
Keywords
DOI
10.1007/978-3-031-16443-9_50
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The vision transformer (ViT) has been the favored paradigm in medical image segmentation since last year, surpassing its traditional CNN counterparts in quantitative metrics. The main advantage of ViTs is that their attention layers model global relations between tokens. However, this increased representation capacity comes with corresponding shortcomings: ViTs lack the inductive biases of CNNs, namely locality, translation invariance, and the hierarchical structure of visual information. Consequently, well-trained ViTs require more data than CNNs. Because high-quality data in medical imaging is always limited, we propose SMESwin Unet. First, building on the Channel-wise Cross fusion Transformer (CCT), we fuse multi-scale semantic features and attention maps by designing a compound structure that combines CNN and ViTs (named MCCT). Second, we introduce superpixels by grouping pixel-level features into district-level features, to avoid interference from meaningless parts of the image. Finally, we use External Attention to consider the correlations among all data samples, which may further reduce the limitation of small datasets. According to our experiments, the proposed superpixel and MCCT-based Swin Unet (SMESwin Unet) achieves better performance than CNNs and other Transformer-based architectures on three medical image segmentation datasets (nucleus, cells, and glands).
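The abstract does not say how External Attention is implemented in SMESwin Unet. As a rough illustration of why such a layer can relate "all data samples", here is a minimal sketch of the commonly described external-attention formulation: two small memories are shared by every sample, and each token attends to memory slots rather than to other tokens in the same image. The double normalization (softmax over tokens, then L1 over slots), the function name `external_attention`, and all shapes are assumptions for illustration, not the authors' code; random values stand in for learned parameters.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def external_attention(features, mem_k, mem_v, eps=1e-12):
    """features: n_tokens x dim. mem_k, mem_v: n_slots x dim.

    mem_k and mem_v are independent of the input, so the same memories
    are used for every sample (hypothetical stand-ins for learned weights).
    """
    logits = matmul(features, transpose(mem_k))  # n_tokens x n_slots

    # Assumed step 1: softmax over the token axis, one memory slot at a time.
    soft_cols = []
    for col in transpose(logits):
        peak = max(col)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        soft_cols.append([e / total for e in exps])
    attn = transpose(soft_cols)  # back to n_tokens x n_slots

    # Assumed step 2: L1-normalize each token's weights over the slots.
    attn = [[w / (sum(row) + eps) for w in row] for row in attn]

    return matmul(attn, mem_v)  # n_tokens x dim


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_tokens, dim, n_slots = 16, 8, 4
mem_k = random_matrix(rng, n_slots, dim)
mem_v = random_matrix(rng, n_slots, dim)

# Two different "images" pass through the same shared memories.
out_a = external_attention(random_matrix(rng, n_tokens, dim), mem_k, mem_v)
out_b = external_attention(random_matrix(rng, n_tokens, dim), mem_k, mem_v)
```

Because the memories do not depend on the input, the cost grows linearly with the number of tokens, and anything stored in them during training is shared across the whole dataset. That is one plausible reading of "correlations among all data samples" in the abstract.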
Pages: 517-526
Page count: 10
Related papers
50 in total
  • [1] TCI-UNet: transformer-CNN interactive module for medical image segmentation
    Bian, Xuan
    Wang, Guanglei
    Li, Yan
    Wang, Hongrui
    BIOMEDICAL OPTICS EXPRESS, 2023, 14 (11) : 5904 - 5920
  • [2] LATrans-Unet: Improving CNN-Transformer with Location Adaptive for Medical Image Segmentation
    Lin, Qiqin
    Yao, Junfeng
    Hong, Qingqi
    Cao, Xianpeng
    Zhou, Rongzhou
    Xie, Weixing
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XIII, 2024, 14437 : 223 - 234
  • [3] AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
    Yan, Xiangyi
    Tang, Hao
    Sun, Shanlin
    Ma, Haoyu
    Kong, Deying
    Xie, Xiaohui
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3270 - 3280
  • [4] AFC-Unet: Attention-fused full-scale CNN-transformer unet for medical image segmentation
    Meng, Wenjie
    Liu, Shujun
    Wang, Huajun
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 99
  • [5] TransCUNet: UNet cross fused transformer for medical image segmentation
    Jiang, Shen
    Li, Jinjiang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 150
  • [6] CONVFORMER: COMBINING CNN AND TRANSFORMER FOR MEDICAL IMAGE SEGMENTATION
    Gu, Pengfei
    Zhang, Yejia
    Wang, Chaoli
    Chen, Danny Z.
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,
  • [7] CSWin-UNet: Transformer UNet with cross-shaped windows for medical image segmentation
    Liu, Xiao
    Gao, Peng
    Yu, Tao
    Wang, Fei
    Yuan, Ru-Yue
    INFORMATION FUSION, 2025, 113
  • [8] Remote Sensing Image Road Segmentation Method Integrating CNN-Transformer and UNet
    Wang, Rui
    Cai, Mingxiang
    Xia, Zixuan
    Zhou, Zhicui
    IEEE ACCESS, 2023, 11 : 144446 - 144455
  • [9] SEGTRANSVAE: HYBRID CNN - TRANSFORMER WITH REGULARIZATION FOR MEDICAL IMAGE SEGMENTATION
    Quan-Dung Pham
    Hai Nguyen-Truong
    Nam Nguyen Phuong
    Nguyen, Khoa N. A.
    Nguyen, Chanh D. T.
    Bui, Trung
    Truong, Steven Q. H.
    2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022,
  • [10] An effective CNN and Transformer complementary network for medical image segmentation
    Yuan, Feiniu
    Zhang, Zhengxiao
    Fang, Zhijun
    PATTERN RECOGNITION, 2023, 136