LATrans-Unet: Improving CNN-Transformer with Location Adaptive for Medical Image Segmentation

Cited by: 0
Authors
Lin, Qiqin [1 ]
Yao, Junfeng [1 ,2 ,3 ]
Hong, Qingqi [1 ,3 ,4 ]
Cao, Xianpeng [1 ]
Zhou, Rongzhou [1 ]
Xie, Weixing [1 ]
Affiliations
[1] Xiamen Univ, Sch Film, Sch Informat, Ctr Digital Media Comp, Xiamen 361005, Peoples R China
[2] Minist Culture & Tourism, Key Lab Digital Protect & Intelligent Proc Intang, Xiamen, Peoples R China
[3] Xiamen Univ, Inst Artificial Intelligence, Xiamen 361005, Peoples R China
[4] Hong Kong Ctr Cerebrocardiovasc Hlth Engn COCHE, Hong Kong, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XIII | 2024, Vol. 14437
Keywords
Medical image segmentation; Transformer; Location information; Skip connection; NET;
DOI
10.1007/978-981-99-8558-6_19
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have been widely employed in medical image segmentation. While CNNs excel at encoding local features, their ability to capture long-range dependencies is limited. ViTs, in contrast, offer strong global modeling capabilities. However, existing attention-based ViT models struggle to adaptively preserve accurate location information, leaving them unable to handle variations in organ shape and position across medical images. To inherit the merits of CNNs and ViTs while avoiding their respective limitations, we propose a novel framework called LATrans-Unet. By comprehensively enhancing the representation of information at both shallow and deep levels, LATrans-Unet maximizes the integration of location information and contextual details. At the shallow levels, a skip connection called SimAM-skip emphasizes information boundaries and bridges the encoder-decoder semantic gap. At the deep levels, to capture variations in organ shape and location, we propose Location-Adaptive Attention, which guides the model to track these changes globally and adaptively, enabling accurate segmentation. Extensive experiments on multi-organ and cardiac segmentation tasks validate the superior performance of LATrans-Unet compared with previous state-of-the-art methods. The code and trained models will be released soon.
Pages: 223 - 234
Number of pages: 12
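
Note: the SimAM-skip connection described in the abstract presumably builds on the published parameter-free SimAM attention module (Yang et al., ICML 2021). The PyTorch sketch below shows that module and one plausible way to apply it on a U-Net skip path; the SimAMSkip wrapper and its concatenation scheme are illustrative assumptions, not the paper's actual design.

# Minimal, self-contained sketch (PyTorch) of parameter-free SimAM attention.
# How LATrans-Unet wires this into its skip connections is not specified in the
# abstract; re-weighting the encoder feature before concatenation is an assumption.
import torch
import torch.nn as nn


class SimAM(nn.Module):
    """Parameter-free attention: weights each activation by an energy-based saliency score."""

    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda  # regularizer in the energy function
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W)
        b, c, h, w = x.shape
        n = h * w - 1
        # Squared deviation of each activation from its channel-wise spatial mean.
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        # Channel-wise spatial variance.
        v = d.sum(dim=(2, 3), keepdim=True) / n
        # Inverse energy: larger for activations that stand out within their channel.
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * self.act(e_inv)


class SimAMSkip(nn.Module):
    """Hypothetical skip connection: re-weight the encoder feature with SimAM
    before concatenating it with the upsampled decoder feature."""

    def __init__(self):
        super().__init__()
        self.attn = SimAM()

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.attn(enc_feat), dec_feat], dim=1)


if __name__ == "__main__":
    skip = SimAMSkip()
    enc = torch.randn(1, 64, 56, 56)   # encoder feature map
    dec = torch.randn(1, 64, 56, 56)   # upsampled decoder feature map
    print(skip(enc, dec).shape)        # torch.Size([1, 128, 56, 56])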