VM-UNet++ research on crack image segmentation based on improved VM-UNet

Cited by: 1
Authors
Tang, Wenliang [1]
Wu, Ziyi [1]
Wang, Wei [1]
Pan, Youqin [1]
Gan, Weihua [2]
Affiliations
[1] East China Jiaotong Univ, Sch Informat & Software Engn, Nanchang 330013, Peoples R China
[2] East China Jiaotong Univ, Sch Transportat & Logist, Nanchang 330013, Peoples R China
Keywords
CNN; Transformer; Mamba; VM-UNet; Crack segmentation; VM-UNet++
DOI
10.1038/s41598-025-92994-7
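Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes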
07 ; 0710 ; 09 ;
Abstract
Cracks are common defects in physical structures, and if not detected and addressed in a timely manner, they can pose a severe threat to the overall safety of the structure. In recent years, with advancements in deep learning, particularly the widespread use of Convolutional Neural Networks (CNNs) and Transformers, significant breakthroughs have been made in the field of crack detection. However, CNNs still face limitations in capturing global information due to their local receptive fields when processing images. On the other hand, while Transformers are powerful in handling long-range dependencies, their high computational cost remains a significant challenge. To effectively address these issues, this paper proposes an innovative modification to the VM-UNet model. This modified model strategically integrates the strengths of the Mamba architecture and UNet to significantly improve the accuracy of crack segmentation. In this study, we optimized the original VM-UNet architecture to better meet the practical needs of crack segmentation tasks. Through comparative experiments on the Crack500 and Ozgenel public datasets, the results clearly demonstrate that the improved VM-UNet achieves significant advancements in segmentation accuracy. Compared to the original VM-UNet and other state-of-the-art models, VM-UNet++ shows a 3% improvement in mDS and a 4.6-6.2% increase in mIoU. These results fully validate the effectiveness of our improvement strategy. Additionally, VM-UNet++ demonstrates lower parameter count and floating-point operations, while maintaining a relatively satisfactory inference speed. These improvements make VM-UNet++ advantageous for practical applications.
Pages: 13