2D Image-To-3D Model: Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)

Cited by: 72
Authors
Alidoost, Fatemeh [1]
Arefi, Hossein [1]
Tombari, Federico [2, 3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Surveying & Geospatial Engn, Tehran 1439957131, Iran
[2] Tech Univ Munich, Chair Comp Aided Med Procedures & Augmented Real, Fac Comp Sci, Boltzmannstr 3, D-85748 Garching, Germany
[3] Google Inc, CH-8002 Zurich, Switzerland
Keywords
building reconstruction; deep learning; convolutional neural networks; building detection; depth prediction
DOI
10.3390/rs11192219
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
In this study, a deep learning (DL)-based approach is proposed for detecting and reconstructing buildings from a single aerial image. The prerequisite knowledge for reconstructing the 3D shapes of buildings, namely the height data and the linear elements of individual roofs, is derived from the RGB image using an optimized multi-scale convolutional-deconvolutional network (MSCDN). The proposed network comprises two feature extraction levels that first predict coarse features and then automatically refine them. The predicted features are the normalized digital surface models (nDSMs) and the linear roof elements in three classes: eave, ridge, and hip lines. Prismatic building models are then generated by analyzing the eave lines, and parametric models of individual roofs are reconstructed from the predicted ridge and hip lines. The experiments show that, even in the presence of noise in the height values, the proposed method performs well on 3D reconstruction of buildings with different shapes and complexities. For the predicted nDSM, the average root mean square error (RMSE) and normalized median absolute deviation (NMAD) are about 3.43 m and 1.13 m, respectively. Moreover, the quality of the extracted linear elements is about 91.31% and 83.69% for the Potsdam and Zeebrugge test data, respectively. Unlike the state-of-the-art methods, the proposed approach needs no additional or auxiliary data and uses a single image to reconstruct the 3D building models, with competitive horizontal and vertical RMSEs of about 1.2 m and 0.8 m over the Potsdam data and about 3.9 m and 2.4 m over the Zeebrugge test data.
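The abstract reports height accuracy as RMSE and NMAD. As a rough illustration of how these two metrics are commonly computed, the sketch below assumes the usual definitions: RMSE as the square root of the mean squared height error, and NMAD as 1.4826 times the median absolute deviation of the errors from their median. The abstract does not state the exact formulas used, so these definitions, the function names `rmse` and `nmad`, and the sample heights are all assumptions made for illustration.

```python
import math
import statistics


def rmse(pred, ref):
    """Root mean square error between predicted and reference heights."""
    errors = [p - r for p, r in zip(pred, ref)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def nmad(pred, ref):
    """Normalized median absolute deviation of the height errors.

    Assumes the common definition 1.4826 * median(|e - median(e)|),
    which is less sensitive to outliers than RMSE.
    """
    errors = [p - r for p, r in zip(pred, ref)]
    med = statistics.median(errors)
    return 1.4826 * statistics.median([abs(e - med) for e in errors])


if __name__ == "__main__":
    # Hypothetical nDSM heights in metres, not data from the paper.
    predicted = [10.2, 7.9, 15.4, 3.1, 22.0]
    reference = [9.8, 8.5, 14.9, 3.0, 12.0]  # last pixel is an outlier
    print(f"RMSE: {rmse(predicted, reference):.2f} m")
    print(f"NMAD: {nmad(predicted, reference):.2f} m")
```

Because NMAD is built on medians, a few grossly wrong pixels inflate RMSE much more than NMAD, which is consistent with the gap between the two reported values (about 3.43 m versus 1.13 m).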
Pages: 25