Hourglass-Shape Network Based Semantic Segmentation for High Resolution Aerial Imagery

Cited by: 117
Authors
Liu, Yu [1, 2]
Duc Minh Nguyen [1]
Deligiannis, Nikos [1]
Ding, Wenrui [2, 3]
Munteanu, Adrian [1]
Affiliations
[1] Vrije Univ Brussel, ETRO Dept, Pl Laan 2, B-1050 Brussels, Belgium
[2] Beihang Univ, Sch Elect & Informat Engn, 37 Xueyuan Rd, Beijing 100191, Peoples R China
[3] Collaborat Innovat Ctr Geospatial Technol, Wuhan 430079, Peoples R China
Keywords
semantic labeling; convolutional neural networks; remote sensing; deep learning; aerial images; SCENE; REPRESENTATION; FEATURES;
DOI
10.3390/rs9060522
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Subject classification code
08 ; 0830 ;
Abstract
A new convolutional neural network (CNN) architecture for semantic segmentation of high resolution aerial imagery is proposed in this paper. The proposed architecture follows an hourglass-shaped network (HSN) design, structured into encoding and decoding stages. Taking advantage of recent advances in CNN designs, we use composed inception modules to replace common convolutional layers, providing the network with multi-scale receptive areas with rich context. Additionally, to reduce spatial ambiguities in the up-sampling stage, skip connections with residual units are employed to feed encoding-stage information forward directly to the decoder. Moreover, overlap inference is employed to alleviate the boundary effects that occur when high resolution images are inferred from small-sized patches. Finally, we propose a post-processing method based on weighted belief propagation to visually enhance the classification results. Extensive experiments on the Vaihingen and Potsdam datasets demonstrate that the proposed architectures outperform three reference state-of-the-art network designs both numerically and visually.
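The overlap inference mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the patch size, the amount of overlap, or how overlapping predictions are merged, so the values below and the simple averaging rule are assumptions, and `predict_patch` is a hypothetical stand-in for the trained network.

```python
import numpy as np

def overlap_inference(image, predict_patch, num_classes, patch=256, stride=128):
    """Average per-class scores over overlapping patches of a large image.

    predict_patch is a hypothetical callable that maps a (patch, patch, C)
    tile to a (patch, patch, num_classes) score array. Averaging the
    overlapping scores is an assumed merging rule, not taken from the paper.
    """
    h, w = image.shape[:2]
    if h < patch or w < patch:
        raise ValueError("image must be at least one patch in size")

    scores = np.zeros((h, w, num_classes), dtype=np.float64)
    counts = np.zeros((h, w, 1), dtype=np.float64)

    def starts(size):
        # Regular grid of offsets, plus a final offset flush with the border.
        offsets = list(range(0, size - patch + 1, stride))
        if offsets[-1] != size - patch:
            offsets.append(size - patch)
        return offsets

    for y in starts(h):
        for x in starts(w):
            tile = image[y:y + patch, x:x + patch]
            scores[y:y + patch, x:x + patch] += predict_patch(tile)
            counts[y:y + patch, x:x + patch] += 1.0

    # Every pixel is covered by at least one patch, so counts is never zero.
    return scores / counts
```

Taking `argmax` over the last axis of the returned array gives a label map. With a stride smaller than the patch size, most pixels near a patch border also fall in the interior of a neighbouring patch, which is the intuition behind reducing boundary effects.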
Pages: 24