Multi-modal fusion of satellite and street-view images for urban village classification based on a dual-branch deep neural network

Cited by: 66
Authors
Chen, Boan [1]
Feng, Quanlong [1, 2]
Niu, Bowen [1]
Yan, Fengqin [2]
Gao, Bingbo [1]
Yang, Jianyu [1]
Gong, Jianhua [3]
Liu, Jiantao [4]
Affiliations
[1] China Agr Univ, Coll Land Sci & Technol, Beijing 100193, Peoples R China
[2] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[3] Chinese Acad Sci, Aerosp Informat Res Inst, Natl Engn Res Ctr Geoinformat, Beijing 100101, Peoples R China
[4] Shandong Jianzhu Univ, Sch Surveying & Geoinformat, Jinan 250101, Shandong, Peoples R China
Keywords
Remote sensing; Street-view; Deep learning; Urban village; SEMANTIC SEGMENTATION; INFORMAL SETTLEMENTS; SLUMS;
DOI
10.1016/j.jag.2022.102794
Chinese Library Classification number
TP7 [Remote sensing technology];
Discipline classification code
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
With rapid urbanization in China, numerous urban villages have appeared, surrounded by newly built urban blocks. Because of their high population density, poor hygiene, chaotic waste discharge, and inadequate public facilities, urban villages have many negative impacts on both the urban environment and urban management. This study proposes a dual-branch deep learning model that fuses multi-modal satellite and street-view data to detect urban villages in Beijing, Tianjin, and Shijiazhuang, the core cities of the Jing-Jin-Ji region of China. Specifically, the proposed model consists of a satellite branch, a street-view branch, and a gated-fusion module. For the satellite branch, a Trans-MDCNN (multi-scale dilated convolutional neural network) is proposed to learn multi-level local features and global contextual features from high-resolution satellite imagery; for the street-view branch, an MVRAN (multi-view recurrent attention network) is constructed to learn and fuse multi-angle features from street-view images. A gated-fusion module is designed to aggregate the important features of the two branches. Experimental results indicate that the proposed model achieves an overall accuracy (OA) of 92.61%. An ablation study shows that, compared with satellite data alone, integrating street-view images increases the OA by about 2%. In addition, 1-D feature fusion outperforms its 2-D counterpart and the classic feature concatenation method. The proposed model also outperforms the other deep learning models it was compared with. Finally, the dataset of this study, S²UV (Satellite & Street-view images for Urban Village classification), is publicly available: https://doi.org/10.11922/sciencedb.01410.
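The gated fusion of two 1-D branch features described in the abstract can be illustrated with a minimal sketch. The abstract does not give the gating formula, so the form used here is an assumption: a sigmoid gate computed from the concatenated features, followed by an element-wise convex combination. The names `gated_fusion`, `sat_feat`, `sv_feat`, `weights`, and `bias` are hypothetical and do not come from the paper.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(sat_feat, sv_feat, weights, bias):
    """Fuse two equal-length 1-D feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        gate  = sigmoid(W @ [sat; sv] + b)
        fused = gate * sat + (1 - gate) * sv
    """
    if len(sat_feat) != len(sv_feat):
        raise ValueError("both branches must emit vectors of equal length")
    concat = list(sat_feat) + list(sv_feat)
    fused = []
    for i, (s, v) in enumerate(zip(sat_feat, sv_feat)):
        # One gate value per feature dimension, driven by both branches.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * v)
    return fused


if __name__ == "__main__":
    random.seed(0)
    dim = 4
    sat = [random.uniform(-1, 1) for _ in range(dim)]   # stand-in satellite features
    sv = [random.uniform(-1, 1) for _ in range(dim)]    # stand-in street-view features
    W = [[random.gauss(0, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
    b = [0.0] * dim
    print(gated_fusion(sat, sv, W, b))
```

With this form, a gate near 1 passes the satellite feature through and a gate near 0 passes the street-view feature, whereas plain concatenation keeps both without weighting either.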
Pages: 15