Semantic segmentation of slums in satellite images using transfer learning on fully convolutional neural networks

Cited by: 256
Authors
Wurm, Michael [1 ]
Stark, Thomas [2 ]
Zhu, Xiao Xiang [2 ,3 ]
Weigand, Matthias [1 ,4 ]
Taubenboeck, Hannes [1 ]
Affiliations
[1] German Aerosp Ctr DLR, German Remote Sensing Data Ctr DFD, D-82234 Oberpfaffenhofen, Germany
[2] TUM, Signal Proc Earth Observat SiPEO, D-80333 Munich, Germany
[3] German Aerosp Ctr DLR, Remote Sensing Technol Inst IMF, D-82234 Oberpfaffenhofen, Germany
[4] Univ Wurzburg, Dept Remote Sensing, D-97074 Wurzburg, Germany
Funding
European Research Council;
Keywords
Slums; FCN; Convolutional neural networks; Deep learning; Transfer learning; HIGH-RESOLUTION; INFORMAL SETTLEMENTS; CLASSIFICATION; SPACE;
DOI
10.1016/j.isprsjprs.2019.02.006
Chinese Library Classification (CLC)
P9 [Physical Geography];
Discipline classification code
0705; 070501;
Abstract
Unprecedented urbanization, particularly in countries of the Global South, results in informal urban development processes, especially in megacities. With an estimated 1 billion slum dwellers globally, the United Nations has made the fight against poverty its number one Sustainable Development Goal. To provide better infrastructure, and thus a better life, to slum dwellers, detailed information on the spatial location and size of slums is of crucial importance. In the past, remote sensing has proven to be an extremely valuable and effective tool for mapping slums. The machine learning approaches used for such mapping, however, required considerable effort to train the models. Recent advances in deep learning allow trained fully convolutional networks (FCNs) to be transferred from one data set to another. In this study, we therefore analyze the transfer learning capabilities of FCNs for slum mapping in various satellite images. A model trained on very-high-resolution optical satellite imagery from QuickBird is transferred to Sentinel-2 and TerraSAR-X data. While free-of-charge Sentinel-2 data is widely available, its comparably lower resolution makes slum mapping a challenging task. TerraSAR-X data, on the other hand, has a higher resolution and is considered a powerful data source for intra-urban structure analysis. Due to the different image characteristics of SAR compared to optical data, however, transferring the model could not improve the performance of semantic segmentation. For the optical data, in contrast, we observe very high accuracies for mapped slums: the QuickBird image obtains 86-88% (positive predictive value and sensitivity), and transfer learning yields a significant increase for Sentinel-2 (from 38% to 55% PPV and from 79% to 85% sensitivity).
Using transfer learning proves extremely valuable for retrieving information on small-scale urban structures such as slum patches, even in satellite images of decametric resolution.
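The cross-sensor transfer described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' exact model: the tiny encoder-decoder architecture, the band counts (4 for QuickBird-like input, 10 for Sentinel-2-like input), and the choice of which layers to fine-tune are all assumptions made for the example. The core idea shown is copying every pretrained weight except the first convolution, whose input depth differs between sensors, then fine-tuning only the sensor-specific layers.

```python
# Hedged sketch of FCN transfer learning across satellite sensors.
# Architecture and band counts are illustrative assumptions.
import torch
import torch.nn as nn


class TinyFCN(nn.Module):
    """Minimal encoder-decoder FCN for binary slum / non-slum segmentation."""

    def __init__(self, in_bands: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_bands, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        # 1x1 conv head + upsampling restores the input resolution
        self.head = nn.Sequential(
            nn.Conv2d(32, 2, 1),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        return self.head(self.encoder(x))


# 1) "Source" model for 4-band optical imagery (weights here are random
#    stand-ins for a trained checkpoint).
source = TinyFCN(in_bands=4)

# 2) Target model for 10-band input: copy every weight except the first
#    convolution ("encoder.0"), whose input depth differs between sensors.
target = TinyFCN(in_bands=10)
state = {k: v for k, v in source.state_dict().items()
         if not k.startswith("encoder.0")}
target.load_state_dict(state, strict=False)

# 3) Fine-tune only the new first layer and the head; freeze the rest.
for name, p in target.named_parameters():
    p.requires_grad = name.startswith("encoder.0") or name.startswith("head")

logits = target(torch.randn(1, 10, 64, 64))  # (batch, classes, H, W)
print(logits.shape)  # torch.Size([1, 2, 64, 64])
```

The per-pixel class scores in `logits` would then be trained with a standard cross-entropy loss against slum / non-slum reference masks; freezing the shared encoder layers is what lets the features learned on the source sensor carry over.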
Pages: 59-69
Number of pages: 11