Local-Global Multiscale Fusion Network for Semantic Segmentation of Buildings in SAR Imagery

Citations: 0
Authors
Zhou, Xuanyu [1]
Zhou, Lifan [1]
Zhang, Haizhen [2, 3]
Ji, Wei [4]
Zhou, Bei [1]
Affiliations
[1] Changshu Inst Technol, Sch Comp Sci & Engn, Suzhou 215500, Peoples R China
[2] China Meteorol Adm, Natl Satellite Meteorol Ctr, Key Lab Radiometr Calibrat & Validat Environm Sate, Beijing 100081, Peoples R China
[3] Innovat Ctr FengYun Meteorol Satellite, Beijing 100081, Peoples R China
[4] Geovis Environm Technol Co Ltd, Technol Res Ctr, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Radar polarimetry; Semantics; Semantic segmentation; Buildings; Transformers; Feature extraction; Task analysis; Convolutional neural networks (CNNs); dual encoder-decoder; semantic segmentation of buildings; synthetic aperture radar (SAR) images; transformer; HIGH-RESOLUTION SAR;
DOI
10.1109/JSTARS.2024.3379403
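Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract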
Extracting buildings from synthetic aperture radar (SAR) images is a challenging task in remote sensing (RS). In recent years, convolutional neural networks (CNNs) have advanced rapidly and found application in RS. Researchers have investigated the potential of CNNs for the semantic segmentation of SAR images, achieving substantial improvements. However, the semantic segmentation of buildings in SAR images remains challenging because the features of buildings and other ground objects are highly similar in SAR images and because building structures vary widely. In this article, we propose the local-global multiscale fusion network (LGMFNet), based on a dual encoder-decoder structure, for the semantic segmentation of buildings in SAR images. LGMFNet introduces an auxiliary encoder with a transformer structure to compensate for the limited global modeling ability of the CNN-based main encoder. To embed global dependencies hierarchically into the CNN, we designed the global-local semantic aggregation module (GLSM). The GLSM serves as a bridge between the dual encoders to achieve semantic guidance and coupling from the local to the global level. Furthermore, to bridge the semantic gap between different scales, we designed the multiscale feature fusion network (MSFN) as the decoder. The MSFN achieves interactive fusion of semantic information across scales by constructing the multiscale feature fusion module. Experimental results show that LGMFNet achieves an mIoU of 91.17% on the BIGSARDATA 2023 AISAR competition dataset, outperforming the second-best method by a margin of 0.78%. This indicates the superiority of LGMFNet over other state-of-the-art methods.
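The abstract reports its result as mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed, here is a minimal sketch in plain Python. It assumes a binary labeling (0 = background, 1 = building) and the standard per-class IoU averaged over classes; the record does not describe the paper's exact evaluation protocol, and the function name `mean_iou` and the toy masks are hypothetical.

```python
from typing import List, Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int = 2,
) -> float:
    """Mean IoU over classes for two label masks of equal shape.

    Assumption: labels are integers in [0, num_classes); classes that
    appear in neither mask are skipped rather than counted as zero.
    """
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy masks (hypothetical data, not from the paper).
predicted = [[1, 1, 0], [0, 0, 0]]
reference = [[1, 0, 0], [0, 0, 1]]
score = mean_iou(predicted, reference)
```

For these toy masks the background IoU is 3/5 and the building IoU is 1/3, so the mean is their average. Published benchmarks usually accumulate intersections and unions over the whole test set before dividing, which this per-image sketch does not do.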
Pages: 7410-7421
Page count: 12