A Deep Cross-Modal Fusion Network for Road Extraction With High-Resolution Imagery and LiDAR Data

Cited by: 11
Authors
Luo, Hui [1]
Wang, Zijing [1]
Du, Bo [2,3]
Dong, Yanni [4]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430079, Peoples R China
[2] Wuhan Univ, Inst Artificial Intelligence, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan 430072, Peoples R China
[3] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan 430072, Peoples R China
[4] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
Convolutional neural network; cross-modal feature fusion (CMFF); high-resolution remote sensing image; LiDAR data; road extraction; SEMANTIC SEGMENTATION; INFORMATION; MULTISCALE;
DOI
10.1109/TGRS.2024.3360963
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Urban road extraction is important for urban planning and transportation applications. High-resolution imagery (HRI) is one of the most popular data sources for extracting roads efficiently and at low cost. However, roads in HRI are easily obscured by buildings, trees, and other landscape elements, which leaves the extracted roads discontinuous. Although current road extraction techniques based on multimodal data fusion improve on single-modal methods by incorporating additional information, most existing fusion methods neither fully exploit the features of the different modalities nor consider prior knowledge of roads. To address these problems, this article proposes a dual encoder-based cross-modal complementary fusion network (DECCFNet). The proposed network takes full advantage of the rich feature information contained in HRI and of the immunity of LiDAR data to shadows. By effectively fusing the complementary information from HRI and LiDAR data, DECCFNet improves IoU by at least 2.94% and 2.8% on the two datasets, respectively, compared with methods that use only a single data modality. DECCFNet mainly contains two modules: 1) a cross-modal feature fusion (CMFF) module: in the dual-encoder part, CMFF fuses the deep features of the different modalities along the channel and spatial dimensions, while a multiscale fusion strategy extracts contextual information; 2) a multi-direction strip convolution (MDSC) module: because roads are narrow and continuous, applying classical convolution kernels directly to road features may introduce irrelevant pixels into the computation and blur the extraction results. To mitigate this issue, MDSC applies strip convolutions to road features from multiple directions, building on square convolution, so that the network focuses more on road-specific features.
In a comparison with several deep-learning multimodal data fusion networks on the two road datasets, the proposed network achieves the best road extraction results.
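The motivation for strip convolution given in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' MDSC module: the function names (`strip_response`, `square_response`), the four directions, the kernel lengths, and the use of plain averaging instead of learned weights are all assumptions made for illustration only. It shows that on a one-pixel-wide horizontal road, a strip aligned with the road covers only road pixels, whereas a square window or a misaligned strip mixes in background pixels.

```python
# Hypothetical illustration only; not the DECCFNet/MDSC implementation.
# Each direction is a (row step, column step) pair defining a 1-pixel-wide strip.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti_diagonal": (1, -1),
}


def strip_response(image, direction, length=5):
    """Average the pixels along a strip centered on each pixel (zero padding)."""
    dr, dc = DIRECTIONS[direction]
    rows, cols = len(image), len(image[0])
    half = length // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for k in range(-half, half + 1):
                rr, cc = r + k * dr, c + k * dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += image[rr][cc]
            out[r][c] = total / length
    return out


def square_response(image, size=3):
    """Average the pixels in a size x size window centered on each pixel."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for rr in range(r - half, r + half + 1):
                for cc in range(c - half, c + half + 1):
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
            out[r][c] = total / (size * size)
    return out


# Toy 7x7 mask with a one-pixel-wide horizontal road on row 3.
road = [[0.0] * 7 for _ in range(7)]
for col in range(7):
    road[3][col] = 1.0

# Responses at the central road pixel (3, 3) for each strip direction.
center = {name: strip_response(road, name)[3][3] for name in DIRECTIONS}
center["square"] = square_response(road)[3][3]

for name, value in center.items():
    print(f"{name:>14}: {value:.3f}")
```

In a trained network the strip kernels would carry learned weights and would typically be combined across directions; the averaging used here only makes the geometric argument visible.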
Pages: 1 / 15
Number of pages: 15