Comparative analysis of deep learning based building extraction methods with the new VHR Istanbul dataset

Times cited: 21
Authors
Bakirman, Tolga [1]
Komurcu, Irem [2]
Sertel, Elif [3]
Affiliations
[1] Yildiz Tech Univ, Geomat Engn, TR-34220 Istanbul, Turkey
[2] Deloitte Touche Tohmatsu Ltd Turkey, Istanbul, Turkey
[3] Istanbul Tech Univ, Geomat Engn, Istanbul, Turkey
Keywords
Building extraction; Deep learning; Pléiades; Urban; REMOTE-SENSING IMAGERY; NETWORK; CLASSIFICATION
DOI
10.1016/j.eswa.2022.117346
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic building segmentation from satellite images is an important task for various applications such as urban mapping, disaster management and regional planning. With the broader availability of very high-resolution satellite images, deep learning-based techniques have been widely used for remote sensing image-related tasks. In this study, we generated a new building dataset, the Istanbul dataset, for the building segmentation task. We used 150 Pléiades image tiles of 1500 x 1500 pixels covering an 85 km² area of Istanbul and labelled approximately 40,000 buildings, representing different building structures and spatial distributions. We extensively investigated the ideal architecture, encoder and hyperparameter settings for building segmentation tasks using the new Istanbul dataset. More than 60 experiments were conducted by applying state-of-the-art architectures such as U-Net, Unet++, DeepLabv3+, FPN and PSPNet with different pretrained encoders and hyperparameters. Our experiments showed that the Unet++ architecture with an SE-ResNeXt101 encoder pre-trained on ImageNet provides the best results, with 93.8% IoU on the Istanbul dataset. To demonstrate our solution's generalizability, the ideal network was also trained separately on the Inria and Massachusetts building segmentation datasets. The networks produced IoU values of 75.39% and 92.53% on the Inria and Massachusetts datasets, respectively. The results indicate that our ideal network settings outperform other methods in terms of building segmentation, even without any specific architectural modification. The weight files and inference notebook are available at: https://github.com/TolgaBkm/Istanbul_Dataset.
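The abstract reports its results as IoU (Intersection over Union). As a minimal sketch of the standard definition of that metric for binary building masks, the snippet below counts the pixels marked as building in both masks and divides by the pixels marked in either. The function name `iou`, the nested-list mask representation and the convention for two empty masks are assumptions made for illustration; the abstract does not state how the authors thresholded predictions or aggregated scores across tiles.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel is building in both masks
            if p or t:
                union += 1         # pixel is building in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 3 x 3 example: 2 pixels overlap, 4 pixels are marked in total.
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
truth = [[1, 1, 0],
         [0, 0, 0],
         [0, 1, 0]]
score = iou(pred, truth)
print(f"IoU = {score:.2f}")
```

On real tiles the same ratio would normally be computed with array operations rather than Python loops; the loop form is used here only to keep the definition explicit.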
Pages: 13