Coformer: Collaborative Transformer for Medical Image Segmentation

Cited by: 0
Authors
Gao, Yufei [1, 2]
Zhang, Shichao [1, 2]
Zhang, Dandan [3]
Shi, Yucheng [3]
Zhao, Guohua [4]
Shi, Lei [1, 2]
Affiliations
[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou 460002, Peoples R China
[2] SongShan Lab, Zhengzhou 450001, Peoples R China
[3] Zhengzhou Univ, Sch Comp & Artificial Intelligence, Zhengzhou 450001, Peoples R China
[4] Zhengzhou Univ, Dept Magnet Resonance Imaging, Affiliated Hosp 1, Zhengzhou 450004, Peoples R China
Source
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024 | 2024, vol. 14864
Keywords
Transformer; Medical image segmentation; Cross attention
DOI
10.1007/978-981-97-5588-2_21
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformers have shown significant power in medical image analysis. However, their inherent design limits the extraction of local features, which can potentially affect performance. To address this limitation, a Collaborative Transformer (Coformer) is proposed for medical image segmentation. Specifically, a Multiscale Representation Fusion (MRF) module is designed to extract semantic information from the fused features. In the encoding phase, local and global multi-scale feature representations are extracted by incorporating the Swin Transformer. In the decoding phase, the MRF module, which is based on a cross-attention mechanism, then extracts deeper semantic features. Comparative experiments on the public Synapse Multi-Organ Segmentation dataset show that Coformer achieves a Dice score of 82.39%, an improvement of 1.7% to 12.62% over state-of-the-art methods.
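For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a Dice score is commonly computed for multi-class segmentation. It is not the authors' evaluation code: the function names `dice_score` and `mean_dice`, the flat label-list input, the averaging over classes, and the convention of returning 1.0 when a class is absent from both prediction and ground truth are all assumptions made for illustration.

```python
def dice_score(pred, target, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) for one class over flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    # Count pixels where both prediction and ground truth carry this label.
    intersection = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    total = sum(pred_mask) + sum(target_mask)
    if total == 0:
        # Assumed convention: class absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Average the per-class Dice over the given labels (assumed protocol)."""
    return sum(dice_score(pred, target, c) for c in labels) / len(labels)


# Toy example: six pixels, background 0 and two hypothetical organ labels 1 and 2.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
per_class = {c: dice_score(pred, target, c) for c in (1, 2)}
average = mean_dice(pred, target, labels=(1, 2))
```

In this toy case label 1 overlaps on one pixel out of three marked in total (Dice 2/3), and label 2 overlaps on two pixels out of five (Dice 0.8). A reported figure such as 82.39% would correspond to this kind of average expressed as a percentage, under whatever averaging protocol the paper actually uses.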
Pages: 240-250
Page count: 11