MFVNet: a deep adaptive fusion network with multiple field-of-views for remote sensing image semantic segmentation

Cited by: 62

Authors:

Li, Yansheng [1]

Chen, Wei [1]

Huang, Xin [1]

Gao, Zhi [1]

Li, Siwei [1]

He, Tao [1]

Zhang, Yongjun [1]

Affiliations:

[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China

Source:

SCIENCE CHINA-INFORMATION SCIENCES | 2023 / Vol. 66 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

semantic segmentation; remote sensing image (RSI); field-of-view (FOV); adaptive fusion; convolutional neural network;

DOI:

10.1007/s11432-022-3599-y

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In recent years, remote sensing image (RSI) semantic segmentation has attracted increasing research interest because of its wide range of applications. RSIs are difficult to process holistically on current GPU cards because of their large fields of view (FOVs). However, prevailing practices such as downsampling and cropping inevitably degrade the quality of semantic segmentation. To address this conflict, this paper proposes a new deep adaptive fusion network with multiple FOVs (MFVNet), designed specifically for RSI semantic segmentation. Unlike existing methods, MFVNet takes the differences among multiple FOVs into account. By pyramid sampling the RSI, we first obtain images at different scales with multiple FOVs. Images at the high scale with a large FOV capture larger spatial contexts and complete object contours, while images at the low scale with a small FOV retain higher spatial resolution and more detailed information. Scale-specific models are then chosen to make the best predictions at each scale. Next, the output feature maps and score maps are aligned by the scale alignment module to overcome spatial misregistration among scales. Finally, the aligned score maps are fused using adaptive weight maps generated by the adaptive fusion module, producing the fused prediction. MFVNet outperforms previous state-of-the-art semantic segmentation models on three typical RSI datasets, demonstrating the effectiveness of the proposed method.
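The final fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in MFVNet the weight maps are produced by a learned adaptive fusion module, whereas here they are simply passed in as inputs, and the per-pixel softmax normalization is an assumption made for this example. The function names (`fuse_score_maps`, `predict_labels`) and the toy data are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_score_maps(score_maps, weight_logits):
    """Fuse K aligned score maps with per-pixel weights (illustrative only).

    score_maps:    K maps, each H x W x C (class scores per pixel),
                   assumed to be already spatially aligned.
    weight_logits: K maps, each H x W (unnormalized weight per pixel).
    Returns an H x W x C fused score map.
    """
    num_scales = len(score_maps)
    height = len(score_maps[0])
    width = len(score_maps[0][0])
    num_classes = len(score_maps[0][0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: normalize the K weights at this pixel to sum to 1.
            weights = softmax([weight_logits[k][y][x] for k in range(num_scales)])
            row.append([
                sum(weights[k] * score_maps[k][y][x][c] for k in range(num_scales))
                for c in range(num_classes)
            ])
        fused.append(row)
    return fused


def predict_labels(fused):
    """Pick the highest-scoring class at every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in fused]


# Toy data: 2 scales, a 1 x 2 image, 2 classes.
small_fov = [[[0.9, 0.1], [0.2, 0.8]]]
large_fov = [[[0.4, 0.6], [0.7, 0.3]]]
# Equal weights at pixel 0; the large-FOV branch is strongly favored at pixel 1.
logits = [[[0.0, -5.0]], [[0.0, 5.0]]]
fused = fuse_score_maps([small_fov, large_fov], logits)
labels = predict_labels(fused)
```

In this toy case a plain average at the second pixel would give scores of 0.45 and 0.55 and pick class 1, while the weighted fusion follows the large-FOV branch and picks class 0, which is the kind of per-pixel behavior adaptive weight maps make possible.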

Pages: 14

References (55 in total):

[1] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J]. Badrinarayanan, Vijay; Kendall, Alex; Cipolla, Roberto. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12): 2481-2495

[2] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [J]. Chen, Liang-Chieh; Zhu, Yukun; Papandreou, George; Schroff, Florian; Adam, Hartwig. COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211: 833-851

[3] All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification [J]. Chen, Weijie; Xie, Di; Zhang, Yuan; Pu, Shiliang. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 7234-7243

[4] CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement [J]. Cheng, Ho Kei; Chung, Jihoon; Tai, Yu-Wing; Tang, Chi-Keung. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 8887-8896

[5] Human-centered concepts for exploration and understanding of earth observation images [J]. Datcu, M; Seidel, K. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2005, 43 (03): 601-609

[6] Devlin J, 2019, arXiv, DOI [arXiv:1810.04805, 10.48550/arxiv.1810.04805]

[7] Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images [J]. Ding, Lei; Lin, Dong; Lin, Shaofu; Zhang, Jing; Cui, Xiaojie; Wang, Yuebin; Tang, Hao; Bruzzone, Lorenzo. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60

[8] Adversarial Shape Learning for Building Extraction in VHR Remote Sensing Images [J]. Ding, Lei; Tang, Hao; Liu, Yahui; Shi, Yilei; Zhu, Xiao Xiang; Bruzzone, Lorenzo. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 678-690

[9] Semantic Segmentation of Large-Size VHR Remote Sensing Images Using a Two-Stage Multiscale Training Architecture [J]. Ding, Lei; Zhang, Jing; Bruzzone, Lorenzo. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (08): 5367-5376

[10] Dosovitskiy Alexey, 2021, P ICLR
