RM-UNet: UNet-like Mamba with rotational SSM module for medical image segmentation

Cited by: 5
Authors
Tang, Hao [1]
Huang, Guoheng [1]
Cheng, Lianglun [1]
Yuan, Xiaochen [2]
Tao, Qi [3]
Chen, Xuhang [4]
Zhong, Guo [5]
Yang, Xiaohui [6]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Peoples R China
[2] Macao Polytech Univ, Fac Appl Sci, Macau 999078, Peoples R China
[3] Guangdong Technion Israel Inst Technol, Dept Mech Engn Robot, Shantou 515063, Peoples R China
[4] Huizhou Univ, Sch Comp Sci & Engn, Huizhou 516007, Peoples R China
[5] Guangdong Univ Foreign Studies, Sch Informat Sci & Technol, Guangzhou 510006, Peoples R China
[6] Sun Yat-sen Univ, Affiliated Hosp 3, Dept Gynecol, Guangzhou, Peoples R China
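Keywords
U-Net; State Space Models; Medical image segmentation; Mamba; LSIL; U-NET ARCHITECTURE; TRANSFORMER
DOI
10.1007/s11760-024-03484-8
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract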
Accurate segmentation of tissues and lesions is crucial for disease diagnosis, treatment planning, and surgical navigation. Yet the complexity of medical images poses significant challenges for traditional Convolutional Neural Networks and Transformer models, owing to their limited receptive fields and high computational complexity, respectively. State Space Models (SSMs), particularly Mamba and its variants, have recently shown notable performance on vision tasks. However, their feature extraction may not be sufficiently effective, and they retain some redundant structures, leaving room for parameter reduction. In response to these challenges, we introduce Rotational Mamba-UNet (RM-UNet), characterized by a Residual Visual State Space (ResVSS) block and a Rotational SSM Module. The ResVSS block is designed to mitigate the network degradation caused by the diminishing efficacy of information transfer from shallower to deeper layers, while the Rotational SSM Module is designed to address the challenges of channel feature extraction within State Space Models. Finally, we propose a weighted multi-level loss function that fully leverages the outputs of the decoder's three stages for supervision. We conducted experiments on the ISIC17, ISIC18, CVC-300, Kvasir-SEG, CVC-ColonDB, and Kvasir-Instrument datasets, as well as Low-grade Squamous Intraepithelial Lesion (LSIL) datasets provided by The Third Affiliated Hospital of Sun Yat-sen University, demonstrating the superior segmentation performance of our proposed RM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters. Our code is available at https://github.com/Halo2Tang/RM-UNet.
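The abstract does not spell out the weighted multi-level loss, so the following is only a minimal sketch of the general idea of supervising three decoder stages with a weighted sum. The per-stage loss (binary cross-entropy plus soft Dice), the stage weights, and all function names here are illustrative assumptions, not the authors' implementation; the sketch also assumes each stage output has already been resized to the ground-truth resolution and flattened.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flat lists of probabilities and binary labels."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(pred)

def stage_loss(pred, target):
    """Per-stage loss. BCE + Dice is an assumed choice, not taken from the paper."""
    return bce_loss(pred, target) + dice_loss(pred, target)

def weighted_multi_level_loss(stage_preds, target, weights=(0.25, 0.25, 0.5)):
    """Weighted sum of per-stage losses over the decoder stage outputs.

    The default weights are hypothetical placeholders.
    """
    if len(stage_preds) != len(weights):
        raise ValueError("need exactly one weight per decoder stage")
    return sum(w * stage_loss(p, target) for w, p in zip(weights, stage_preds))

# Usage: three hypothetical decoder-stage predictions for a 4-pixel mask.
target = [1, 0, 1, 0]
stage_preds = [
    [0.6, 0.4, 0.7, 0.3],  # coarsest stage
    [0.8, 0.2, 0.8, 0.2],  # middle stage
    [0.9, 0.1, 0.9, 0.1],  # final stage
]
loss = weighted_multi_level_loss(stage_preds, target)
```

Giving the final stage the largest weight is one plausible design choice, since it produces the prediction used at inference; the weighting actually used by RM-UNet should be taken from the paper or the linked repository.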
Pages: 8427-8443
Number of pages: 17