Medical Image Segmentation Based on Multi-Scale Convolution Modulation

Cited by: 0
Authors
Zhou, Xin-Min [1 ,2 ]
Xiong, Zhi-Mou [3 ]
Shi, Chang-Fa [4 ,5 ]
Yang, Jian [3 ]
Affiliations
[1] School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Hunan, Changsha
[2] Xiangjiang Laboratory, Hunan, Changsha
[3] School of Computer Science, Hunan University of Technology and Business, Hunan, Changsha
[4] School of Intelligent Engineering and Intelligent Manufacturing, Hunan University of Technology and Business, Hunan, Changsha
[5] Changsha Social Laboratory of Artificial Intelligence, Hunan University of Technology and Business, Hunan, Changsha
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2024, Vol. 52, No. 09
Funding
National Natural Science Foundation of China
Keywords
convolutional modulation; medical image segmentation; multi-scale; Transformer
DOI
10.12263/DZXB.20231068
Abstract
Currently, more and more medical image segmentation models adopt the Transformer as their basic structure. However, the computational complexity of the Transformer is quadratic in the input sequence length, and it requires a large amount of data for pre-training to achieve good results. When data are insufficient, the Transformer's advantages cannot be fully realized. Additionally, the Transformer often fails to effectively extract local information from images. In contrast, convolutional neural networks avoid both of these problems. To fully leverage the strengths of both convolutional neural networks and Transformers, and to further explore the potential of convolutional neural networks, this paper proposes a multi-scale convolution modulation network (MSCMNet). This model incorporates the design methodology of visual Transformer models into traditional convolutional networks. Using convolution modulation and multi-scale feature extraction strategies, it constructs a feature extraction module based on multi-scale convolution modulation (MSCM). Efficient patch combination and patch decomposition strategies are also proposed for downsampling and upsampling of feature maps, respectively, further enhancing the model's representation ability. The mDice scores obtained on four medical image segmentation datasets of different types and sizes (abdominal multi-organ, cardiac, skin cancer, and cell nucleus) are 0.8057, 0.9233, 0.9239, and 0.8548, respectively. With lower computational complexity and parameter count, MSCMNet achieves the best segmentation performance, providing a novel and efficient model structure design paradigm for convolutional neural networks and Transformers in the field of medical image segmentation. © 2024 Chinese Institute of Electronics. All rights reserved.
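The abstract describes convolution modulation at multiple kernel scales: depthwise convolutions produce a spatial "attention" map that weights the features by elementwise (Hadamard) product, replacing softmax self-attention. The sketch below illustrates only this general idea in plain NumPy; the function names, kernel scales, and random weights are illustrative assumptions, not the paper's actual MSCM module, whose details are not given in the abstract.

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """Per-channel ('depthwise') 2D convolution with zero padding.
    x: (C, H, W) feature map, kernels: (C, k, k), k odd."""
    C, H, W = x.shape
    k = kernels.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * kernels[c])
    return out

def conv_modulation(x, scales=(3, 5)):
    """Multi-scale convolution modulation sketch: at each kernel scale,
    a depthwise convolution yields a context map that modulates the
    input by Hadamard product; the scales are then summed."""
    rng = np.random.default_rng(0)  # illustrative random weights
    out = np.zeros_like(x)
    for k in scales:
        kernels = rng.standard_normal((x.shape[0], k, k)) / (k * k)
        attn = depthwise_conv2d(x, kernels)  # per-channel spatial context
        out += attn * x                      # modulation, not softmax attention
    return out
```

A real implementation would use learned weights and a framework's grouped-convolution primitive; this loop form just makes the modulation step explicit.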
Pages: 3159-3171
Page count: 12
Related Papers
34 references in total
  • [1] ZHENG G Y, LIU X B, HAN G H., Survey on medical image computer aided detection and diagnosis systems, Journal of Software, 29, 5, pp. 1471-1514, (2018)
  • [2] LECUN Y, BOSER B, DENKER J S, Et al., Handwritten digit recognition with a back-propagation network, Advances in Neural Information Processing Systems 2, pp. 396-404, (1990)
  • [3] RONNEBERGER O, FISCHER P, BROX T., U-Net: Convolutional networks for biomedical image segmentation, Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015, pp. 234-241, (2015)
  • [4] YIN X H, WANG Y C, LI D Y., Survey of medical image segmentation technology based on U-net structure improvement, Journal of Software, 32, 2, pp. 519-550, (2021)
  • [5] ZHOU T, HUO B Q, LU H L, Et al., Research on residual neural network and its application on medical image processing, Acta Electronica Sinica, 48, 7, pp. 1436-1447, (2020)
  • [6] LIU J P, WU J J, ZHANG R, Et al., Toward automated segmentation of COVID-19 chest CT images based on structural reparameterization and multi-scale deep supervision, Acta Electronica Sinica, 51, 5, pp. 1163-1171, (2023)
  • [7] ZHOU Z W, RAHMAN SIDDIQUEE M M, TAJBAKHSH N, Et al., UNet++: A nested U-Net architecture for medical image segmentation, Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 3-11, (2018)
  • [8] OKTAY O, SCHLEMPER J, FOLGOC L L, Et al., Attention U-Net: Learning where to look for the pancreas
  • [9] ZHANG S J, PENG Z, LI H., SAU-net: Medical image segmentation method based on U-net and self-attention, Acta Electronica Sinica, 50, 10, pp. 2433-2442, (2022)
  • [10] CICEK O, ABDULKADIR A, LIENKAMP S S, Et al., 3D U-Net: Learning dense volumetric segmentation from sparse annotation, Medical Image Computing and Computer-Assisted Intervention-MICCAI 2016, pp. 424-432, (2016)