Local-Global Multiscale Fusion Network for Semantic Segmentation of Buildings in SAR Imagery

Cited by: 0

Authors:

Zhou, Xuanyu [1]

Zhou, Lifan [1]

Zhang, Haizhen [2,3]

Ji, Wei [4]

Zhou, Bei [1]

Affiliations:

[1] Changshu Inst Technol, Sch Comp Sci & Engn, Suzhou 215500, Peoples R China

[2] China Meteorol Adm, Natl Satellite Meteorol Ctr, Key Lab Radiometr Calibrat & Validat Environm Sate, Beijing 100081, Peoples R China

[3] Innovat Ctr FengYun Meteorol Satellite, Beijing 100081, Peoples R China

[4] Geovis Environm Technol Co Ltd, Technol Res Ctr, Beijing 100094, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING | 2024 / Vol. 17

Funding:

National Natural Science Foundation of China

Keywords:

Radar polarimetry; Semantics; Semantic segmentation; Buildings; Transformers; Feature extraction; Task analysis; Convolutional neural networks (CNNs); dual encoder-decoder; semantic segmentation of buildings; synthetic aperture radar (SAR) images; transformer; HIGH-RESOLUTION SAR;

DOI:

10.1109/JSTARS.2024.3379403

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Extracting buildings from synthetic aperture radar (SAR) images is a challenging task in remote sensing (RS). In recent years, convolutional neural networks (CNNs) have advanced rapidly and found application in RS. Researchers have investigated the potential of CNNs for the semantic segmentation of SAR images, with notable improvements. However, the semantic segmentation of buildings in SAR images remains difficult because the features of other ground objects closely resemble those of buildings in SAR images, and because building structures vary widely. In this article, we propose the local-global multiscale fusion network (LGMFNet), based on a dual encoder-decoder structure, for the semantic segmentation of buildings in SAR images. LGMFNet introduces an auxiliary encoder with a transformer structure to compensate for the limited global modeling ability of the main encoder, which has a CNN structure. To embed global dependencies hierarchically into the CNN, we designed the global-local semantic aggregation module (GLSM). The GLSM serves as a bridge between the dual encoders to achieve semantic guidance and coupling from the local to the global level. Furthermore, to bridge the semantic gap between different scales, we designed the multiscale feature fusion network (MSFN) as the decoder. The MSFN achieves interactive fusion of semantic information across scales by constructing the multiscale feature fusion module. Experimental results demonstrate that the proposed LGMFNet achieves a mean intersection over union (mIoU) of 91.17% on the BIGSARDATA 2023 AISAR competition dataset, outperforming the second-best method by a margin of 0.78%. These results indicate that LGMFNet outperforms other state-of-the-art methods.
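The abstract reports its headline result as mIoU. This record does not describe the evaluation protocol, so the following is only a minimal sketch of the standard mIoU definition (per-class intersection over union, averaged over classes). It assumes two classes, background (0) and building (1). The function names `confusion_counts` and `mean_iou` and the toy masks are hypothetical illustrations, not taken from the paper.

```python
def confusion_counts(pred, truth, num_classes):
    """Build a confusion matrix: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            matrix[t][p] += 1
    return matrix


def mean_iou(pred, truth, num_classes=2):
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    matrix = confusion_counts(pred, truth, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        fn = sum(matrix[c]) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x4 masks (assumed data): 0 = background, 1 = building.
pred = [[0, 0, 1, 1],
        [0, 1, 1, 1]]
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1]]

print(mean_iou(pred, truth))
```

In this toy case the building class has IoU 4/5 and the background class has IoU 3/4, so the mean is their average. A real evaluation would accumulate the confusion matrix over the whole test set before averaging, and the competition's exact rules may differ from this sketch.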

Citation:

Pages: 7410-7421

Page count: 12

