A diffusion model based on multi-scale spatial Mamba for medical image segmentation

Times cited: 0
Authors
Li, Chun [1]
Sun, Qiule [2]
Zhang, Muqing [3,4]
Zhang, Jianxin [4]
Affiliations
[1] Wannan Med Coll, Sch Med Informat, Wuhu 241002, Anhui, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Liaoning, Peoples R China
[3] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116024, Liaoning, Peoples R China
[4] Dalian Minzu Univ, Sch Comp Sci & Engn, Dalian 116600, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Medical image segmentation; Denoising diffusion model; Mamba; Global features; State space model;
DOI
10.1016/j.engappai.2025.111028
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Medical image segmentation plays a key role in disease diagnosis, treatment planning, and monitoring of disease progression. Recently, denoising diffusion models have shown significant promise in generating accurate pixel-level semantic representations. In this study, we introduce a diffusion model based on multi-scale spatial Mamba (MSM-Diff) designed for precise medical image segmentation. MSM-Diff integrates the strengths of diffusion models, the Mamba architecture, and convolutional neural networks to efficiently capture both global and local contextual information from complex volumetric data. The core of MSM-Diff is the Mamba-based U-shaped feature encoder (MUFE), which combines the three-dimensional multi-scale spatial Mamba model (MS-Mamba) with extracted features for enhanced multi-scale and global feature extraction. By using the Mamba architecture, the model maintains linear computational complexity. Additionally, MSM-Diff incorporates a multi-scale gated spatial convolution (MS-GSC) module within MUFE to further refine spatial feature representations. Extensive evaluations on three public datasets demonstrate that MSM-Diff consistently outperforms current methods, achieving state-of-the-art performance in Dice similarity coefficient (DSC) and 95th-percentile Hausdorff distance (HD95). This model provides a robust solution for medical image segmentation by effectively capturing global context and accurately delineating boundaries, thereby improving diagnostic and treatment planning outcomes for radiologists and clinicians.
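For readers unfamiliar with the two metrics named in the abstract, the following is a minimal NumPy sketch of how DSC and HD95 are commonly computed for binary masks. It is not taken from the paper: the function names `dice_coefficient` and `hd95`, the `spacing` parameter, and the choice to measure distances between all foreground voxels (rather than extracted surface voxels, as many toolkits do) are assumptions made here for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom


def hd95(pred, target, spacing=None):
    """Simplified 95th-percentile Hausdorff distance between two binary masks.

    Assumption: distances are taken between all foreground voxels, not only
    boundary voxels, and are computed by brute force (small masks only).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    a = np.argwhere(pred).astype(float)
    b = np.argwhere(target).astype(float)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("hd95 is undefined when either mask is empty")
    if spacing is not None:
        # Convert voxel indices to physical units (e.g. millimetres).
        a = a * np.asarray(spacing, dtype=float)
        b = b * np.asarray(spacing, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)  # nearest target voxel for each predicted voxel
    b_to_a = dists.min(axis=0)  # nearest predicted voxel for each target voxel
    return float(np.percentile(np.concatenate([a_to_b, b_to_a]), 95))


if __name__ == "__main__":
    pred = np.zeros((4, 4), dtype=int)
    target = np.zeros((4, 4), dtype=int)
    pred[1:3, 1:3] = 1    # 2x2 square
    target[1:3, 2:4] = 1  # same square shifted one column to the right
    print("DSC :", dice_coefficient(pred, target))
    print("HD95:", hd95(pred, target))
```

Higher DSC indicates better overlap, while lower HD95 indicates closer boundary agreement; taking the 95th percentile rather than the maximum makes the distance less sensitive to a few outlier voxels.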
Pages: 13