Remote Sensing Image Road Segmentation Method Integrating CNN-Transformer and UNet

Cited by: 2
Authors
Wang, Rui [1 ]
Cai, Mingxiang [1 ]
Xia, Zixuan [2 ]
Zhou, Zhicui [3 ]
Affiliations
[1] China Transport Telecommun & Informat Ctr, Beijing 100011, Peoples R China
[2] Heilongjiang Univ Technol, Harbin 150022, Heilongjiang, Peoples R China
[3] No 1 Middle Sch Weifang, Jixi 150022, Heilongjiang, Peoples R China
Keywords
Road segmentation; deep learning; CNN-transformer; attention; UNet; extraction
DOI
10.1109/ACCESS.2023.3344797
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Real-time, accurate road information is crucial for updating electronic navigation maps. To address the low precision and poor robustness of current semantic segmentation methods for road extraction from remote sensing imagery, we propose an attention-enhanced UNet model for road semantic segmentation. First, we introduce a CNN-Transformer hybrid structure into the encoder to strengthen the extraction of both global and local features. Second, we replace the traditional upsampling module in the decoder with a dual upsampling module to improve feature extraction and segmentation accuracy. Third, we use the hard-swish activation function in place of ReLU; its smoother curve helps improve generalization and non-linear feature extraction and helps avoid vanishing gradients. Finally, we use a combined loss function built from cross-entropy and Dice terms to constrain the segmentation result more strongly and further improve accuracy. The model is validated experimentally on the Ottawa Road Dataset and the Massachusetts Road Dataset. Compared with U-Net, PSPNet, DeepLab V3, and TransUNet, the proposed method achieves the best MIoU, MPA, and F1 score. Its MPA reaches 95.48% on the Ottawa Road Dataset and 92.56% on the Massachusetts Road Dataset, showing good performance in road extraction.
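The abstract names two components that are easy to state concretely: the hard-swish activation and a loss that combines cross-entropy with Dice. The following is a minimal sketch of both in plain Python, not the authors' implementation. The abstract does not say how the two loss terms are weighted, so the `ce_weight` parameter, the `smooth` constant, the binary (road versus background) formulation, and all function names are assumptions made for illustration.

```python
import math


def hard_swish(x: float) -> float:
    """Hard-swish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-pixel road probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """1 - Dice coefficient; `smooth` (assumed value) avoids division by zero."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, ce_weight=0.5):
    """Weighted sum of the two terms; the 0.5 weighting is an assumption."""
    ce = cross_entropy_loss(probs, targets)
    dice = dice_loss(probs, targets)
    return ce_weight * ce + (1.0 - ce_weight) * dice


if __name__ == "__main__":
    labels = [1, 0, 1, 0]  # hypothetical flattened road mask
    print(combined_loss([0.9, 0.1, 0.8, 0.2], labels))
    print(hard_swish(3.0))
```

For inputs at or above 3, hard-swish passes the value through unchanged (`hard_swish(3.0)` returns `3.0`), and for inputs at or below -3 it returns zero. The Dice term depends on the overlap between predicted and labeled road pixels rather than on per-pixel accuracy, which is one common reason to pair it with cross-entropy when road pixels are a small fraction of the image.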
Pages: 144446-144455
Number of pages: 10