Semantic Segmentation on Remote Sensing Images with Multi-Scale Feature Fusion

Cited by: 0
Authors
Zhang J. [1]
Jin Q. [1, 2]
Wang H. [2]
Da C. [2]
Xiang S. [2]
Pan C. [2]
Affiliations
[1] School of Automation, Harbin University of Science and Technology, Harbin
[2] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing
Source
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2019, Vol. 31, No. 09
Keywords
Deep convolutional neural network; Remote sensing images; Semantic segmentation
DOI
10.3724/SP.J.1089.2019.17645
Abstract
Semantic segmentation of remote sensing images has drawn extensive attention from both academia and industry because of its wide range of applications, such as urban planning, urban change detection and geographic information systems. Nevertheless, many complicating factors, such as complex backgrounds, shadows, and objects whose scales, topological shapes and appearances vary across regions, make this task quite challenging. Accordingly, this paper proposes a deep convolutional neural network model with multi-scale information fusion for semantic segmentation of remote sensing images. The model is composed of two parts: an encoder and a decoder. In the encoder, a strategy is proposed to fuse multi-scale features based on the DenseNet network. Specifically, global average pooling is first used to extract regional semantic information from different sub-regions, helping the network understand the complex backgrounds in remote sensing images; sub-region global average pooling and multi-scale convolution are then used to deal with complex background areas. In the decoder, we design a shorter decoder that fuses features from different convolutional levels to accurately restore image details. For the overall model, we design a hierarchical supervision mechanism with multiple outputs. This mechanism allows the model to obtain supervision from different levels, which helps guide the training of the network. Extensive experiments on the ISPRS benchmark datasets and a Beijing remote sensing dataset demonstrate the effectiveness of our approach. © 2019, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
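The sub-region average pooling step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `subregion_average_pool` and `fuse_multiscale`, the grid sizes (1, 2, 4), the nearest-neighbour upsampling, and the channel-wise concatenation are all assumptions chosen for illustration, and the DenseNet backbone, multi-scale convolutions, decoder and multi-output supervision are omitted.

```python
import numpy as np


def subregion_average_pool(features, grid):
    """Average a (C, H, W) feature map over a grid x grid set of sub-regions,
    then copy each regional mean back to the full H x W resolution."""
    c, h, w = features.shape
    if h % grid or w % grid:
        raise ValueError("H and W must be divisible by grid in this sketch")
    bh, bw = h // grid, w // grid
    # View the map as (C, grid, bh, grid, bw) and average inside each block.
    blocks = features.reshape(c, grid, bh, grid, bw)
    means = blocks.mean(axis=(2, 4))  # shape (C, grid, grid)
    # Nearest-neighbour upsampling back to (C, H, W).
    return np.repeat(np.repeat(means, bh, axis=1), bw, axis=2)


def fuse_multiscale(features, grids=(1, 2, 4)):
    """Concatenate the input with its sub-region means at several grid sizes.

    The grid sizes are hypothetical; grid=1 is plain global average pooling.
    """
    pooled = [subregion_average_pool(features, g) for g in grids]
    return np.concatenate([features] + pooled, axis=0)


if __name__ == "__main__":
    x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
    fused = fuse_multiscale(x)
    print(fused.shape)
```

With a 2-channel 4 x 4 input and three grid sizes, the fused map has 2 x (1 + 3) = 8 channels: the original features plus one regional-context copy per grid size.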
Pages: 1509-1517
Page count: 8