Sub-pixel multi-scale fusion network for medical image segmentation

Cited by: 0
Authors
Jing Li [1]
Qiaohong Chen [1]
Xian Fang [1]
Affiliation
[1] School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou
Keywords
Adaptive gate; CNN; Medical image segmentation; Multi-scale fusion; Sub-pixel feature; Transformer
DOI
10.1007/s11042-024-20338-0
Abstract
CNNs and Transformers have significantly advanced medical image segmentation. Combining their strengths yields rich feature extraction but also raises the challenge of fusing mixed multi-scale features. To address this, we propose a deep medical image segmentation framework termed Sub-pixel Multi-scale Fusion Network (SMFNet), which incorporates the sub-pixel multi-scale feature fusion results of CNN and Transformer into the architecture. Our design consists of three modules. First, the Sub-pixel Convolutional Module brings the features extracted at multiple scales to a consistent resolution. Next, the Three-level Enhancement Module learns features from adjacent layers and exchanges information between them. Lastly, the Hierarchical Adaptive Gate fuses information from other contextual levels through the Sub-pixel Convolutional Module. Extensive experiments on the Synapse, ACDC, and ISIC 2018 datasets demonstrate the effectiveness of the proposed SMFNet, which outperforms other competitive CNN-based and Transformer-based segmentation methods. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
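The abstract does not spell out how the modules are implemented, so the following is only a rough sketch of the general idea: a sub-pixel (pixel-shuffle) rearrangement brings a coarse feature map to the resolution of a finer one, and a sigmoid gate then blends the two. The function names `pixel_shuffle` and `gated_fuse`, the gate formula, and the toy shapes are all assumptions for illustration, not SMFNet's actual design.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Standard sub-pixel layout: output[c, h*r + i, w*r + j]
    is taken from input[c*r*r + i*r + j, h, w].
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)   # merge (h, i) and (w, j)


def gated_fuse(a, b, gate_logits):
    """Hypothetical gate: element-wise sigmoid blend of two same-shape maps."""
    g = 1.0 / (1.0 + np.exp(-gate_logits))
    return g * a + (1.0 - g) * b


# Toy example: a coarse 4-channel 2x2 map and a fine 1-channel 4x4 map.
coarse = np.arange(4 * 2 * 2, dtype=np.float64).reshape(4, 2, 2)
fine = np.zeros((1, 4, 4))

upsampled = pixel_shuffle(coarse, 2)    # now (1, 4, 4), same as `fine`
fused = gated_fuse(upsampled, fine, np.zeros_like(fine))  # zero logits: equal blend
```

In a trained network the gate logits would presumably be predicted from the features themselves; here they are fixed at zero only to keep the sketch self-contained.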
Pages: 89355-89373
Page count: 18