Remote Sensing Image Road Segmentation Method Integrating CNN-Transformer and UNet

Cited by: 2

Authors:

Wang, Rui [1]

Cai, Mingxiang [1]

Xia, Zixuan [2]

Zhou, Zhicui [3]

Affiliations:

[1] China Transport Telecommun & Informat Ctr, Beijing 100011, Peoples R China

[2] Heilongjiang Univ Technol, Harbin 150022, Heilongjiang, Peoples R China

[3] No 1 Middle Sch Weifang, Jixi 150022, Heilongjiang, Peoples R China

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Road segmentation; deep learning; CNN-transformer; attention; UNet; EXTRACTION;

DOI:

10.1109/ACCESS.2023.3344797

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Real-time and accurate road information is crucial for updating electronic navigation maps. To address the low precision and poor robustness of current semantic segmentation methods for road extraction from remote sensing imagery, we propose an improved UNet road segmentation model based on the attention mechanism. First, we introduce a CNN-Transformer hybrid structure into the encoder to strengthen the extraction of both global and local features. Second, the traditional upsampling module in the decoder is replaced with a dual upsampling module to improve feature extraction and segmentation accuracy. Furthermore, the hard-swish activation function is used instead of ReLU; its smoother curve helps improve generalization and non-linear feature extraction and helps avoid vanishing gradients. Finally, a combined loss function of cross entropy and Dice is used to constrain the segmentation result more tightly and further improve accuracy. The model is validated experimentally on the Ottawa Road Dataset and the Massachusetts Road Dataset. Compared with U-Net, PSPNet, DeepLab V3, and TransUNet, the proposed method achieves the best MIoU, MPA, and F1 score. Its MPA reaches 95.48% on the Ottawa Road Dataset and 92.56% on the Massachusetts Road Dataset. These results indicate that the method performs well for road extraction.
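As an illustration of two components named in the abstract, the following is a minimal sketch of the hard-swish activation and of a loss that combines cross entropy with Dice. It is not the authors' implementation. The binary road/background setting, the equal weighting (`dice_weight=0.5`), the smoothing constant, and all function names are assumptions made for the example; the record does not state them.

```python
import math


def hard_swish(x: float) -> float:
    """Hard-swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross entropy over a flat list of pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + s) / (sum(P) + sum(T) + s)."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


def combined_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of cross entropy and Dice (the weight is an assumption)."""
    ce = binary_cross_entropy(probs, targets)
    dl = dice_loss(probs, targets)
    return (1.0 - dice_weight) * ce + dice_weight * dl


if __name__ == "__main__":
    targets = [1, 0, 1, 1]             # hypothetical road mask (1 = road)
    good = [0.9, 0.1, 0.8, 0.95]       # prediction close to the mask
    bad = [0.2, 0.9, 0.3, 0.1]         # prediction far from the mask
    print(hard_swish(1.0))
    print(combined_loss(good, targets), combined_loss(bad, targets))
```

The Dice term measures region overlap and is less sensitive to the heavy class imbalance typical of road masks, while cross entropy gives per-pixel gradients; this is the usual motivation for combining the two.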


Pages: 144446-144455

Number of pages: 10
