Semantic Segmentation of Urban Buildings from VHR Remote Sensing Imagery Using a Deep Convolutional Neural Network

Cited by: 194
Authors
Yi, Yaning [1, 2]
Zhang, Zhijie [3 ]
Zhang, Wanchang [1 ]
Zhang, Chuanrong [3 ]
Li, Weidong [3 ]
Zhao, Tian [4 ]
Affiliations
[1] Chinese Acad Sci, Inst Remote Sensing & Digital Earth, Key Lab Digital Earth Sci, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Univ Connecticut, Dept Geog, Storrs, CT 06269 USA
[4] Univ Wisconsin, Dept Comp Sci, Milwaukee, WI 53211 USA
Keywords
semantic segmentation; urban building extraction; deep convolutional neural network; VHR remote sensing imagery; U-Net; AERIAL IMAGES; CLASSIFICATION; EXTRACTION; LIDAR; AREAS; SVM;
DOI
10.3390/rs11151774
Chinese Library Classification number
X [Environmental Science, Safety Science];
Subject classification code
08 ; 0830 ;
Abstract
Urban building segmentation is an active research topic in very high resolution (VHR) remote sensing; however, the varied appearance of buildings and the complicated backgrounds in VHR imagery make accurate semantic segmentation of urban buildings a challenge in practical applications. Following the basic architecture of U-Net, an end-to-end deep convolutional neural network (denoted DeepResUnet) was proposed, which performs urban building segmentation at the pixel level from VHR imagery and generates accurate segmentation results. The method contains two sub-networks: a cascade down-sampling network that extracts feature maps of buildings from the VHR image, and an up-sampling network that reconstructs those feature maps back to the size of the input VHR image. Deep residual learning was adopted to facilitate training and to alleviate the degradation problem that often occurs during model training. The proposed DeepResUnet was tested on aerial images with a spatial resolution of 0.075 m and compared, under identical conditions, with six other state-of-the-art networks: FCN-8s, SegNet, DeconvNet, U-Net, ResUNet and DeepUNet. Extensive experiments indicated that DeepResUnet outperformed the other six networks in semantic segmentation of urban buildings in both visual and quantitative evaluation, especially in labeling irregularly shaped and small buildings with higher accuracy and completeness. Compared with U-Net, the F1 score, Kappa coefficient and overall accuracy of DeepResUnet improved by 3.52%, 4.67% and 1.72%, respectively. Moreover, DeepResUnet required far fewer parameters than U-Net, highlighting its significant improvement among U-Net-based approaches. Nevertheless, the inference time of DeepResUnet is slightly longer than that of U-Net and remains to be improved.
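The abstract reports F1 score, Kappa coefficient and overall accuracy but this record does not show how they were computed. As a minimal sketch only, assuming binary masks flattened to 0/1 labels (1 = building) and the standard textbook definitions of these metrics, they could be derived from the confusion matrix as follows. The function name `binary_segmentation_metrics` is hypothetical and is not taken from the paper.

```python
def binary_segmentation_metrics(truth, pred):
    """Compute overall accuracy, F1 and Cohen's kappa for binary masks.

    Assumption: `truth` and `pred` are equal-length flat sequences of
    0/1 labels, where 1 marks a building pixel. This is an illustrative
    helper, not the evaluation code used in the paper.
    """
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    if len(truth) == 0:
        raise ValueError("masks must not be empty")

    # Confusion-matrix counts.
    tp = fp = fn = tn = 0
    for t, p in zip(truth, pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1

    n = tp + fp + fn + tn
    overall_accuracy = (tp + tn) / n

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = ((overall_accuracy - expected) / (1 - expected)
             if expected != 1 else 0.0)

    return {"overall_accuracy": overall_accuracy, "f1": f1, "kappa": kappa}
```

Kappa discounts agreement that would occur by chance, which matters when background pixels dominate the scene; that may be one reason it is reported alongside overall accuracy.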
Pages: 19