MFTSC: A Semantically Constrained Method for Urban Building Height Estimation Using Multiple Source Images

Cited by: 10
Authors
Chen, Yuhan [1, 2]
Yan, Qingyun [1 ]
Huang, Weimin [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Remote Sensing & Geomatics Engn, Nanjing 210044, Peoples R China
[2] Harbin Engn Univ, Qingdao Innovat & Dev Base Ctr, Qingdao 266400, Peoples R China
[3] Mem Univ, Fac Engn & Appl Sci, St John, NF A1B 3X5, Canada
Funding
National Natural Science Foundation of China
Keywords
height estimation; multi-task learning; Vision Transformer; remote sensing; synthetic aperture radar; depth estimation; SAR
DOI
10.3390/rs15235552
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The use of remote sensing imagery has significantly enhanced the efficiency of building extraction; however, the precise estimation of building height remains a formidable challenge. In light of ongoing advancements in computer vision, numerous techniques leveraging convolutional neural networks and Transformers have been applied to remote sensing imagery, yielding promising outcomes. Nevertheless, most existing approaches directly estimate height without considering the intrinsic relationship between semantic building segmentation and building height estimation. In this study, we present a unified architectural framework that integrates the tasks of building semantic segmentation and building height estimation. We introduce a Transformer model that systematically merges multi-level features with semantic constraints and leverages shallow spatial detail feature cues in the encoder. Our approach excels in both height estimation and semantic segmentation tasks. Specifically, the coefficient of determination (R²) in the height estimation task attains a remarkable 0.9671, with a root mean square error (RMSE) of 1.1733 m. The mean intersection over union (mIoU) for building semantic segmentation reaches 0.7855. These findings underscore the efficacy of multi-task learning by integrating semantic segmentation with height estimation, thereby enhancing the precision of height estimation.
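As a rough illustration of the three metrics reported in the abstract, the following minimal Python sketch computes R², RMSE, and mIoU on toy data. The function names and sample values are hypothetical and are not taken from the paper; the authors' evaluation code may differ, for example in how classes absent from both prediction and ground truth are handled.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the inputs (e.g. metres)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mean_iou(label_true, label_pred, num_classes=2):
    """Mean intersection over union across classes.

    Assumption: classes that appear in neither array are skipped
    rather than counted as IoU = 0 or IoU = 1.
    """
    ious = []
    for c in range(num_classes):
        pairs = list(zip(label_true, label_pred))
        inter = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy per-pixel values (hypothetical): heights in metres, labels 0 = background, 1 = building.
heights_true = [0.0, 3.0, 6.0, 9.0]
heights_pred = [0.0, 4.0, 6.0, 8.0]
labels_true = [0, 1, 1, 1]
labels_pred = [0, 1, 1, 0]

print("R2  :", r2_score(heights_true, heights_pred))
print("RMSE:", rmse(heights_true, heights_pred))
print("mIoU:", mean_iou(labels_true, labels_pred))
```

In practice these quantities would be computed over all pixels of the test images, typically with an array library rather than Python lists.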
Pages: 22