Bird's Eye View Semantic Segmentation based on Improved Transformer for Automatic Annotation

Cited by: 4
Authors
Liang, Tianjiao [1, 2]
Pan, Weiguo [1, 2]
Bao, Hong [1, 2]
Fan, Xinyue [1, 2]
Li, Han [1, 2]
Affiliations
[1] Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China
[2] Beijing Union Univ, Coll Robot, Beijing 100101, Peoples R China
Source
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS | 2023 / Vol. 17 / No. 08
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Transformer; Semantic Segmentation; High-Definition maps; Automatic Annotation; ROAD MARKING EXTRACTION; POINT; NETWORK;
DOI
10.3837/tiis.2023.08.002
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
High-definition (HD) maps provide precise road information that enables an autonomous driving system to navigate a vehicle effectively. Recent research has focused on leveraging semantic segmentation to annotate HD maps automatically. However, existing methods suffer from low recognition accuracy in autonomous driving scenarios, which makes the annotation process inefficient. In this paper, we propose a novel semantic segmentation method for automatic HD map annotation. Our approach introduces a new encoder, the convolutional transformer hybrid encoder, to enhance the model's feature extraction capabilities. Additionally, we propose a multi-level fusion module that enables the model to aggregate detail and semantic information from different levels. Furthermore, we present a novel decoupled boundary joint decoder to improve the model's handling of boundaries between categories. To evaluate our method, we conducted experiments on a Bird's Eye View point cloud image dataset and the Cityscapes dataset. Comparative analysis against state-of-the-art methods demonstrates that our model achieves the highest performance among the compared methods. Specifically, our model achieves an mIoU of 56.26%, exceeding the mIoU of SegFormer by 1.47%. This approach promises to significantly enhance the efficiency of automatic HD map annotation.
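The abstract reports results as mIoU (mean Intersection over Union). As a rough illustration of what that metric measures, the sketch below computes per-class IoU and their mean from a confusion matrix, using the standard definition IoU = TP / (TP + FP + FN). This record does not describe the paper's evaluation protocol, so the function name `mean_iou`, the choice to skip classes absent from both labels and predictions, and the toy labels are all assumptions made for illustration.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Return (per-class IoU list, mIoU) for two flat sequences of class ids.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                                  # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp   # pixels wrongly labeled c
        denom = tp + fp + fn
        if denom == 0:
            continue  # assumption: a class absent from both inputs is skipped
        ious.append(tp / denom)

    miou = sum(ious) / len(ious) if ious else 0.0
    return ious, miou


# Toy example: six "pixels", three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious, miou = mean_iou(y_true, y_pred, num_classes=3)
print(ious, miou)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported difference such as 1.47% between two models corresponds to a difference in this averaged score.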
Pages: 1996-2015
Page count: 20