MiM-UNet: An efficient building image segmentation network integrating state space models

Cited by: 0
Authors
Liu, Dong [1 ,2 ,3 ]
Wang, Zhiyong [2 ,4 ]
Liang, Ankai [5 ]
Affiliations
[1] Shandong Youth Univ Polit Sci, Sch Informat Engn, Jinan 250103, Peoples R China
[2] Shandong Prov Engn Res Ctr New Qual Prod & Data As, Jinan, Peoples R China
[3] New Technol Res & Dev Ctr Intelligent Informat Con, Jinan, Peoples R China
[4] Shandong Youth Univ Polit Sci, Sch Accountancy, Jinan 250103, Peoples R China
[5] TikTok Inc, San Jose, CA 95110 USA
Keywords
Building segmentation; Complex terrain; State space models; Remote sensing images; Deep learning; U-NET ARCHITECTURE; SEMANTIC SEGMENTATION;
DOI
10.1016/j.aej.2025.02.035
CLC Number
T [Industrial Technology];
Subject Classification Code
08;
Abstract
With the advancement of remote sensing technology, the analysis of complex terrain images has become crucial for urban planning and geographic information extraction. However, existing models face significant challenges in processing intricate building structures: Transformer-based models suffer from high computational complexity and memory demands, while Convolutional Neural Networks (CNNs) often struggle to capture features across multiple scales and hierarchical levels. To address these limitations, we propose a novel architecture, Mamba-in-Mamba U-Net (MiM-UNet), which integrates the design principles of state-space models (SSMs) to enhance both computational efficiency and feature extraction capacity. Specifically, MiM-UNet refines the traditional encoder-decoder framework by introducing Mamba-in-Mamba blocks, enabling precise multi-scale feature capture and efficient information fusion. Experimental results demonstrate that MiM-UNet outperforms state-of-the-art models in segmentation accuracy on the Massachusetts building dataset, while substantially reducing computational overhead, highlighting its superior performance and promising potential for practical applications.
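The abstract attributes MiM-UNet's efficiency to state-space models (SSMs), which process a sequence with a fixed-size recurrent state rather than the quadratic-cost attention of Transformers. The record does not give the model's equations, so the sketch below is only generic background: a minimal discrete linear SSM recurrence, h_t = A h_{t-1} + B x_t, y_t = C h_t, with illustrative matrices chosen here for the example (Mamba additionally makes these parameters input-dependent, which this sketch omits).

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run a discrete linear state-space model over a 1-D input sequence.

    Recurrence (per time step t):
        h_t = A @ h_{t-1} + B @ x_t   # state update, fixed-size state
        y_t = C @ h_t                 # readout

    Cost is linear in sequence length, which is the efficiency argument
    for SSM-based blocks versus quadratic-cost self-attention.
    """
    state_dim = A.shape[0]
    h = np.zeros(state_dim)
    ys = []
    for x_t in x:
        h = A @ h + B @ np.atleast_1d(x_t)  # fold the new input into the state
        ys.append(C @ h)                    # emit an output from the state
    return np.array(ys)

# Tiny illustrative example (matrices are hypothetical, not from the paper):
# a 2-state SSM acting as a pair of leaky accumulators over a constant input.
A = np.array([[0.9, 0.0],
              [0.0, 0.5]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 1.0]])
y = ssm_scan(np.ones(4), A, B, C)  # outputs grow as the state accumulates
```

In an image-segmentation setting such as MiM-UNet, feature maps are flattened into sequences before a scan of this kind is applied, so the recurrence above is the one-dimensional core of the idea, not the full block.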
Pages: 648-656 (9 pages)