A multimodal fusion framework for urban scene understanding and functional identification using geospatial data

Cited by: 16
Authors
Su, Chen [1, 2]
Hu, Xinli [1, 2, 3]
Meng, Qingyan [1, 2, 3]
Zhang, Linlin [1, 2, 3]
Shi, Wenxu [1, 2]
Zhao, Maofan [1, 2]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Hainan Aerosp Informat Res Inst, Key Lab Earth Observat Hainan Prov, Sanya 572029, Peoples R China
Keywords
Urban scene understanding; Urban function; Multimodal data; Remote sensing; REMOTE; CLASSIFICATION
DOI
10.1016/j.jag.2024.103696
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Urban scene understanding and functional identification are essential for accurately characterizing spatial structure and optimizing city layouts during rapid urbanization. Multimodal data are important for recognizing the distribution patterns of urban functions and revealing their internal details. Previous studies have focused primarily on remote sensing imagery and point-of-interest (POI) data, overlooking the role of building characteristics in determining the functions of urban scenes. These studies are also limited in how they mine and fuse multimodal features. To address these challenges, this study proposes a multimodal fusion framework that integrates remote sensing imagery, POIs, and building footprints for urban scene understanding and functional mapping. The framework employs a dual-branch model that extracts visual semantic features from the remote sensing imagery and socioeconomic features from auxiliary data, such as POIs and building footprints. A branch attention module is designed to assign weights to the dual-branch features. Additionally, a multiscale feature fusion module is introduced to extract and combine multiscale features through modal interaction. Experiments in Beijing and Chengdu validate the effectiveness of the proposed framework, with overall accuracies of 90.04% and 92.07% and kappa coefficients of 0.881 and 0.895, respectively. This study provides empirical evidence to support accurate urban planning and to further promote sustainable urban development. The source code is available at: https://github.com/sssuchen/MMFF.
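The abstract reports two evaluation metrics, overall accuracy and the kappa coefficient. As a minimal sketch of how both are conventionally derived from a confusion matrix (the function name `overall_accuracy_and_kappa` and the 3-class matrix are made up for illustration and are not taken from the paper or its repository):

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: share of samples on the diagonal (overall accuracy).
    p_observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa


# Hypothetical 3-class confusion matrix (illustrative values only).
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
oa, kappa = overall_accuracy_and_kappa(example)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.4f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than overall accuracy for the same predictions.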
Pages: 16
Related papers
56 in total
[1]   Mapping of functional areas in Spain based on mobile phone data during different phases of the COVID-19 pandemic [J].
Arjona, Joaquin Osorio ;
Santacruz, Javier Sebastian Ruiz ;
Samperiz, Julia de las Obras-Loscertales .
JOURNAL OF MAPS, 2023, 19 (01)
[2]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[3]   A novel unsupervised deep learning method for the generalization of urban form [J].
Cai, Jihong ;
Chen, Yimin .
GEO-SPATIAL INFORMATION SCIENCE, 2022, 25 (04) :568-587
[4]   Discovery of urban functional regions based on Node2vec [J].
Cai, Li ;
Zhang, Lanqiuyue ;
Liang, Yu ;
Li, Jin .
APPLIED INTELLIGENCE, 2022, 52 (14) :16886-16899
[5]   Deep learning-based remote and social sensing data fusion for urban region function recognition [J].
Cao, Rui ;
Tu, Wei ;
Yang, Cuixin ;
Li, Qing ;
Liu, Jun ;
Zhu, Jiasong ;
Zhang, Qian ;
Li, Qingquan ;
Qiu, Guoping .
ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2020, 163 :82-97
[6]   Multi-modal fusion of satellite and street-view images for urban village classification based on a dual-branch deep neural network [J].
Chen, Boan ;
Feng, Quanlong ;
Niu, Bowen ;
Yan, Fengqin ;
Gao, Bingbo ;
Yang, Jianyu ;
Gong, Jianhua ;
Liu, Jiantao .
INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 109
[7]   Remote Sensing Scene Classification via Multi-Branch Local Attention Network [J].
Chen, Si-Bao ;
Wei, Qing-Song ;
Wang, Wen-Zhong ;
Tang, Jin ;
Luo, Bin ;
Wang, Zu-Yuan .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 :99-109
[8]   When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs [J].
Cheng, Gong ;
Yang, Ceyuan ;
Yao, Xiwen ;
Guo, Lei ;
Han, Junwei .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2018, 56 (05) :2811-2821
[9]   Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[10]   Dosovitskiy Alexey, 2021, ICLR