Semantic Segmentation of Large-Size VHR Remote Sensing Images Using a Two-Stage Multiscale Training Architecture

Cited by: 127
Authors
Ding, Lei [1]
Zhang, Jing [2]
Bruzzone, Lorenzo [1]
Affiliations
[1] Univ Trento, Dept Informat Engn & Comp Sci, I-38123 Trento, Italy
[2] Beijing Univ Technol, Dept Informat, Beijing 100124, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2020 / Vol. 58 / No. 08
Keywords
Convolutional neural network; deep learning; remote sensing; semantic segmentation; convolutional networks; neural-network; resolution; classification
DOI
10.1109/TGRS.2020.2964675
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Very-high-resolution (VHR) remote sensing images (RSIs) have a significantly larger spatial size than the natural images typically used in computer vision applications. It is therefore computationally unaffordable to train and test classifiers on these images at full size. Commonly used methods for semantic segmentation of RSIs perform training and prediction on cropped image patches and thus fail to incorporate enough context information. To better exploit the correlations between ground objects, we propose a deep architecture with a two-stage multiscale training strategy tailored to the semantic segmentation of large-size VHR RSIs. In the first training stage, a semantic embedding network is designed to learn high-level features from downscaled images covering a large area. In the second training stage, a local feature extraction network is designed to introduce low-level information from cropped image patches. The resulting training strategy fuses complementary information learned from multiple levels to make predictions. Experimental results on two data sets show that it outperforms local-patch-based training models in terms of both accuracy and stability.
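To make the two input scales in the abstract concrete, the following is a minimal sketch of how a large image could be turned into a downscaled wide-area view (the kind of input the first stage learns from) and a set of full-resolution cropped patches (the kind of input the second stage learns from). It is not the authors' implementation: the helper names `downscale` and `crop_patches`, the average-pooling downscaling, the factor of 4, and the 256-pixel patch size are all assumptions chosen for illustration, and the networks themselves are not shown.

```python
import numpy as np


def downscale(image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool an (H, W, C) image by an integer factor.

    Hypothetical helper: the abstract does not say how images are downscaled.
    """
    h, w, c = image.shape
    h, w = h - h % factor, w - w % factor  # trim so the size divides evenly
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


def crop_patches(image: np.ndarray, size: int, stride: int) -> list:
    """Cut an (H, W, C) image into square patches on a regular grid.

    Hypothetical helper: patch size and stride are illustrative assumptions.
    """
    h, w, _ = image.shape
    return [
        image[r:r + size, c:c + size]
        for r in range(0, h - size + 1, stride)
        for c in range(0, w - size + 1, stride)
    ]


# Stand-in for a large VHR tile (random values, not real data).
image = np.random.default_rng(0).random((1024, 1024, 3))

# Stage-one style input: the whole area at reduced resolution.
global_view = downscale(image, factor=4)

# Stage-two style input: small areas at full resolution.
local_patches = crop_patches(image, size=256, stride=256)

print(global_view.shape, len(local_patches), local_patches[0].shape)
```

With these assumed settings, the downscaled view and each patch have the same pixel dimensions, but the former covers the entire tile while each patch covers one sixteenth of it, which is the context-versus-detail trade-off the two stages are described as combining.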
Pages: 5367-5376
Page count: 10