DenseU-Net-Based Semantic Segmentation of Objects in Urban Remote Sensing Images

Cited by: 126
Authors
Dong, Rongsheng [1 ]
Pan, Xiaoquan [1 ]
Li, Fengying [1 ]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Class imbalance; deep convolutional neural networks; median frequency balancing; semantic segmentation; urban remote sensing images;
DOI
10.1109/ACCESS.2019.2917952
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
Class imbalance is a serious problem that plagues the semantic segmentation task in urban remote sensing images. Because large object classes dominate the segmentation task, small object classes are usually suppressed, so solutions based on optimizing overall accuracy are often unsatisfactory. To address the class imbalance in semantic segmentation of urban remote sensing images, we developed a Down-sampling Block (DownBlock) for obtaining context information and an Up-sampling Block (UpBlock) for restoring the original resolution, and we propose an end-to-end deep convolutional neural network (DenseU-Net) architecture for pixel-wise urban remote sensing image segmentation. The main idea of DenseU-Net is to connect convolutional neural network features through cascade operations and to use its symmetric structure to fuse the detail features in shallow layers with the abstract semantic features in deep layers. We also propose a focal loss function weighted by median frequency balancing (MFB_Focal loss), which effectively improves both the accuracy of the small object classes and the overall accuracy. Our experiments on the 2016 ISPRS Vaihingen 2D semantic labeling dataset demonstrated the following outcomes. In the case where boundary pixels were considered (GT), the MFB_Focal loss achieved good overall segmentation performance using the same U-Net model, and the F1-score of the small object class "car" improved by 9.28% compared with the cross-entropy loss function. Using the same MFB_Focal loss, the overall accuracy of DenseU-Net was better than that of U-Net, and the F1-score of the "car" class was 6.71% higher. Finally, without any post-processing, DenseU-Net+MFB_Focal loss achieved an overall accuracy of 85.63% and an F1-score of 83.23% on the "car" class, which is superior to HSN+OI+WBP both numerically and visually.
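The abstract combines two standard ideas: median frequency balancing (per-class weights inversely proportional to class pixel frequency, scaled by the median frequency) and the focal loss (down-weighting well-classified pixels by a factor of (1 - p_t)^gamma). The sketch below, in NumPy, illustrates how such an MFB-weighted focal loss could be computed; the function names are hypothetical and the frequency computation is simplified to global pixel counts rather than the paper's exact formulation, so this is an illustrative approximation, not the authors' implementation.

```python
import numpy as np

def median_frequency_weights(labels, num_classes):
    """Per-class weights via median frequency balancing (simplified):
    weight_c = median(freq) / freq_c, so rare classes get larger weights."""
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    freqs = counts / counts.sum()
    freqs[freqs == 0] = np.nan          # ignore classes absent from the batch
    weights = np.nanmedian(freqs) / freqs
    return np.nan_to_num(weights, nan=0.0)

def mfb_focal_loss(probs, labels, weights, gamma=2.0):
    """Focal loss with MFB class weights.
    probs:  (N, C) softmax outputs; labels: (N,) integer class ids."""
    pt = probs[np.arange(labels.size), labels]       # prob of the true class
    pt = np.clip(pt, 1e-7, 1.0)                      # numerical safety
    per_pixel = -weights[labels] * (1.0 - pt) ** gamma * np.log(pt)
    return per_pixel.mean()
```

With gamma = 0 and uniform weights this reduces to ordinary cross-entropy; the MFB weights keep a dominant background class from drowning out a rare class such as "car", while the focal term concentrates the gradient on hard pixels.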
Pages: 65347-65356
Page count: 10