Position Encoding for 3D Lane Detection via Perspective Transformer

Cited by: 0
Authors
Zhang, Meng Li [1]
Wang, Ming Wei [1]
Deng, Yan Yang [1]
Lei, Xin Yu [1]
Affiliations
[1] Shaanxi Univ Sci & Technol, Shaanxi Joint Lab Artificial Intelligence, Xian 710021, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Three-dimensional displays; Feature extraction; Lane detection; Encoding; Decoding; Task analysis; Convolution; Deep learning; Machine learning; 3D lane detection; position embedding; view conversion; autonomous vehicle; deep learning; machine learning;
DOI
10.1109/ACCESS.2024.3436561
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
3D lane detection from a monocular input image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and temporal tracking to establish the correspondence between 2D and 3D information. However, inaccurate depth information caused by perturbations during this conversion poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose PELD, a model that uses the bird's-eye view (BEV) as a proxy for the transformation and explicitly outputs 3D lane detection results. Specifically, when sampling feature information, feature flipping is proposed to supplement the global context before view conversion, and 3D position encoding generated from the front-view features enhances the depth information. After the 3D position encoding is combined with the feature information, the result is used as the value in the cross-attention module to adaptively supervise the BEV queries. On the one hand, we use deformable attention to sample front-view features and generate an explicit lane representation; on the other hand, we strengthen the supervision of lane line generation by supplementing the front-view features and enhancing the 3D spatial information. PELD achieves more advanced results than previous methods on the OpenLane and Apollo datasets.
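The data flow described in the abstract (flip-augmented front-view features, a 3D position encoding added to them, and BEV queries attending to the result) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names, the fusion of flipped features by addition, the random projection standing in for a learned encoder, and the use of plain dense cross-attention in place of deformable attention are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def flip_augment(feat):
    """Fuse a front-view feature map (H, W, C) with its horizontal flip.

    Assumption: fusion by simple addition; the paper may fuse differently.
    """
    return feat + feat[:, ::-1, :]


def position_encoding_3d(h, w, depths, c, rng):
    """Hypothetical 3D position encoding of shape (H, W, C).

    Each pixel gets (u, v, d) coordinates for every depth bin; these are
    flattened and projected to C channels by a random linear map, which
    stands in for a learned encoder.
    """
    v, u = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    pts = np.stack(
        np.broadcast_arrays(u[..., None], v[..., None], depths[None, None, :]),
        axis=-1,
    )  # (H, W, D, 3)
    flat = pts.reshape(h, w, -1)  # (H, W, 3 * D)
    proj = rng.standard_normal((flat.shape[-1], c)) / np.sqrt(flat.shape[-1])
    return flat @ proj


def cross_attention(bev_queries, feat, pos_enc):
    """Dense cross-attention: BEV queries (N, C) attend to position-aware features.

    Keys and values are both the features plus the 3D position encoding.
    Assumption: single head, no learned projections, no deformable sampling.
    """
    c = feat.shape[-1]
    tokens = (feat + pos_enc).reshape(-1, c)  # (H * W, C)
    attn = softmax(bev_queries @ tokens.T / np.sqrt(c), axis=-1)  # (N, H * W)
    return attn @ tokens, attn


rng = np.random.default_rng(0)
front_view = flip_augment(rng.standard_normal((4, 6, 8)))
pos_enc = position_encoding_3d(4, 6, np.linspace(1.0, 10.0, 5), 8, rng)
bev_out, attn = cross_attention(rng.standard_normal((10, 8)), front_view, pos_enc)
print(bev_out.shape, attn.shape)
```

Each row of attn is a distribution over the front-view locations, so every BEV query ends up as a weighted mix of position-aware image features; a deformable variant would instead sample only a few learned offsets per query.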
Pages: 106480 - 106487
Number of pages: 8
Related papers
50 in total
  • [41] SOFW: A Synergistic Optimization Framework for Indoor 3D Object Detection
    Dai, Kun
    Jiang, Zhiqiang
    Xie, Tao
    Wang, Ke
    Liu, Dedong
    Fan, Zhendong
    Li, Ruifeng
    Zhao, Lijun
    Omar, Mohamed
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 637 - 651
  • [42] Relation Graph Network for 3D Object Detection in Point Clouds
    Feng, Mingtao
    Gilani, Syed Zulqarnain
    Wang, Yaonan
    Zhang, Liang
    Mian, Ajmal
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 92 - 107
  • [43] A Survey on 3D Object Detection Methods for Autonomous Driving Applications
    Arnold, Eduardo
    Al-Jarrah, Omar Y.
    Dianati, Mehrdad
    Fallah, Saber
    Oxtoby, David
    Mouzakitis, Alex
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (10) : 3782 - 3795
  • [44] QT-UNet: A Self-Supervised Self-Querying All-Transformer U-Net for 3D Segmentation
    Haversen, Andreas Hammer
    Bavirisetti, Durga Prasad
    Kiss, Gabriel Hanssen
    Lindseth, Frank
    IEEE ACCESS, 2024, 12 : 62664 - 62676
  • [45] MPCTrans: Multi-Perspective Cue-Aware Joint Relationship Representation for 3D Hand Pose Estimation via Swin Transformer
    Wan, Xiangan
    Ju, Jianping
    Tang, Jianying
    Lin, Mingyu
    Rao, Ning
    Chen, Deng
    Liu, Tingting
    Li, Jing
    Bian, Fan
    Xiong, Nicholas
    SENSORS, 2024, 24 (21)
  • [46] TPAFNet: Transformer-Driven Pyramid Attention Fusion Network for 3D Medical Image Segmentation
    Li, Zheng
    Zhang, Jinhui
    Wei, Siyi
    Gao, Yueyang
    Cao, Chengwei
    Wu, Zhiwei
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (11) : 6803 - 6814
  • [47] Aerial Monocular 3D Object Detection
    Hu, Yue
    Fang, Shaoheng
    Xie, Weidi
    Chen, Siheng
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (04) : 1959 - 1966
  • [48] Monocular 3D Object Detection via Geometric Reasoning on Keypoints
    Barabanau, Ivan
    Artemov, Alexey
    Burnaev, Evgeny
    Murashkin, Vyacheslav
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020 : 652 - 659
  • [49] MonoAMNet: Three-Stage Real-Time Monocular 3D Object Detection With Adaptive Methods
    Pan, Huihui
    Jia, Yisong
    Wang, Jue
    Sun, Weichao
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025, 26 (03) : 3574 - 3587
  • [50] Transformer-Based Optimized Multimodal Fusion for 3D Object Detection in Autonomous Driving
    Alaba, Simegnew Yihunie
    Ball, John E.
    IEEE ACCESS, 2024, 12 : 50165 - 50176