Aerial image semantic segmentation using DCNN predicted distance maps

Cited by: 48
Authors
Chai, Dengfeng [1]
Newsam, Shawn [2]
Huang, Jingfeng [3, 4]
Affiliations
[1] Zhejiang Univ, Sch Earth Sci, Hangzhou 310027, Peoples R China
[2] Univ Calif Merced, Elect Engn & Comp Sci, Merced, CA 95343 USA
[3] Zhejiang Univ, Inst Appl Remote Sensing & Informat Technol, Hangzhou 310058, Peoples R China
[4] Zhejiang Univ, Key Lab Agr Remote Sensing & Informat Syst, Hangzhou 310058, Peoples R China
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Deep learning; Semantic segmentation; DCNNs; Distance maps; Distance transform; Neural networks; Classification
DOI
10.1016/j.isprsjprs.2020.01.023
Chinese Library Classification
P9 [Physical Geography]
Discipline classification codes
0705; 070501
Abstract
This paper addresses the challenge of learning spatial context for the semantic segmentation of high-resolution aerial images using Deep Convolutional Neural Networks (DCNNs). The proposed solution derives a signed distance map for each semantic class from a ground truth label map and trains a DCNN to predict this distance map instead of a score map for each class. Since the distance between a target pixel and its nearest object boundary measures how far the pixel penetrates an object, the distance maps encode spatial context, particularly spatial smoothness. In each class's distance map, positive values mark pixels that belong to that class and negative values mark pixels that do not. A final label map is derived from the predicted distance maps by selecting, at each pixel, the class with the maximum distance. Since neighboring pixels in the distance maps have similar values, the segmentation results are smoother than those of current approaches. The results are shown to be even better than those obtained by post-processing with fully connected Conditional Random Fields (CRFs), a common approach to smoothing the segmentations produced by DCNNs. Experimental results on the semantic labeling challenge dataset show that the proposed approach outperforms most state-of-the-art methods. Our main contribution, though, is the novel idea of replacing the pixel-wise class score maps of DCNNs with distance maps. This idea is orthogonal and complementary to other techniques employed by the state-of-the-art methods and could be used to improve upon them.
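The two non-learned steps described in the abstract, deriving a signed distance map per class from a label map and recovering labels by taking the per-pixel maximum, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `signed_distance_map` and `labels_from_distance_maps` are hypothetical, the distance is assumed to be the Euclidean distance to the nearest pixel on the other side of the class boundary, and any truncation or normalization the paper may apply is omitted because the abstract does not specify it. The brute-force search is only suitable for tiny grids.

```python
import math


def signed_distance_map(labels, cls):
    """Signed distance map for class `cls` from a 2D label grid (list of lists).

    Assumption: distance is Euclidean, measured to the nearest pixel whose
    class membership differs. Values are positive inside the class and
    negative outside it.
    """
    h, w = len(labels), len(labels[0])
    inside = [(r, c) for r in range(h) for c in range(w) if labels[r][c] == cls]
    outside = [(r, c) for r in range(h) for c in range(w) if labels[r][c] != cls]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            is_in = labels[r][c] == cls
            others = outside if is_in else inside
            if others:
                d = min(math.hypot(r - rr, c - cc) for rr, cc in others)
            else:
                # The class fills the whole image or is absent: no boundary.
                d = math.inf
            dist[r][c] = d if is_in else -d
    return dist


def labels_from_distance_maps(maps):
    """Pick, at each pixel, the class whose distance map has the largest value.

    `maps` is a dict mapping class id -> 2D distance map of equal shape.
    """
    classes = sorted(maps)
    first = maps[classes[0]]
    h, w = len(first), len(first[0])
    return [
        [max(classes, key=lambda k: maps[k][r][c]) for c in range(w)]
        for r in range(h)
    ]


# Toy 3x3 label map with two classes (0 and 1).
toy_labels = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
]
toy_maps = {k: signed_distance_map(toy_labels, k) for k in (0, 1)}
recovered = labels_from_distance_maps(toy_maps)
```

With ground-truth distance maps, the per-pixel maximum reproduces the original label map exactly, since each pixel is positive only in its own class's map. In the setting the abstract describes, the maps passed to the second step would instead be the DCNN's predictions.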
Pages: 309-322
Page count: 14