Position Encoding for 3D Lane Detection via Perspective Transformer

Cited by: 0
Authors
Zhang, Meng Li [1]
Wang, Ming Wei [1]
Deng, Yan Yang [1]
Lei, Xin Yu [1]
Affiliations
[1] Shaanxi Univ Sci & Technol, Shaanxi Joint Lab Artificial Intelligence, Xian 710021, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Three-dimensional displays; Feature extraction; Lane detection; Encoding; Decoding; Task analysis; Convolution; Deep learning; Machine learning; 3D lane detection; position embedding; view conversion; autonomous vehicle; deep learning; machine learning;
DOI
10.1109/ACCESS.2024.3436561
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
3D lane detection from a monocular input image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and time series tracking to establish the correspondence between 2D and 3D information. However, inaccurate depth information caused by perturbations during conversion poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose the PELD model, which uses a bird's-eye-view (BEV) proxy transformation to give 3D lane detection results explicitly. Specifically, when sampling features, we propose feature flipping to supplement global context before view conversion, and the 3D position encoding generated from the front-view features enhances the depth information. After the 3D position encoding is combined with the features, the result is used as the value in the cross-attention module for adaptive supervision of the BEV queries. On the one hand, we use deformable attention to sample front-view features and generate an explicit lane representation; on the other hand, we support supervised lane line generation by supplementing the front-view features and enhancing 3D spatial information. PELD outperforms previous approaches on the OpenLane and Apollo datasets.
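The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`flip_and_concat`, `cross_attention`), the placeholder position encoding `pos_enc`, and the tiny shapes are all assumptions made for illustration. Plain dense single-head attention stands in for the deformable attention named in the abstract. The sketch shows only (1) augmenting front-view (forward-looking) features with their horizontally flipped copy and (2) using features plus a position encoding as the value that BEV queries attend to.

```python
import math


def flip_and_concat(feat):
    """Extend each position's feature vector with the vector found at the
    horizontally mirrored position (a stand-in for "feature flipping").

    feat: list of rows; each row is a list of feature vectors (lists of floats).
    """
    return [
        [vec + row[len(row) - 1 - x] for x, vec in enumerate(row)]
        for row in feat
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (dense, NOT deformable)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Toy front-view feature map: 1 row, 2 positions, 2 channels (hypothetical).
feat = [[[1.0, 0.0], [0.0, 1.0]]]
flipped = flip_and_concat(feat)

# Flatten the map into a token list for attention.
tokens = [vec for row in flipped for vec in row]

# Placeholder 3D position encoding, one vector per token. The paper derives
# this from the front-view features; the values here are made up.
pos_enc = [[0.1] * 4, [0.2] * 4]

# Values = features combined with the position encoding (element-wise sum,
# which is an assumption about how the two are combined).
values = [[f + p for f, p in zip(t, pe)] for t, pe in zip(tokens, pos_enc)]

# One hypothetical BEV query; all-zero, so it attends uniformly.
bev_queries = [[0.0] * 4]
bev_out = cross_attention(bev_queries, tokens, values)
print(bev_out)
```

With an all-zero query the attention weights are uniform, so the output is simply the mean of the two value vectors; a learned BEV query would instead weight the front-view tokens unevenly.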
Pages: 106480-106487
Page count: 8
Related papers
50 in total
  • [31] SSD-MonoDETR: Supervised Scale-Aware Deformable Transformer for Monocular 3D Object Detection
    He, Xuan
    Yang, Fan
    Yang, Kailun
    Lin, Jiacheng
    Fu, Haolong
    Wang, Meng
    Yuan, Jin
    Li, Zhiyong
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 555 - 567
  • [32] TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection
    Yu, Zitong
    Li, Xiaobai
    Wang, Pichao
    Zhao, Guoying
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1290 - 1294
  • [33] MaskLRF: Self-Supervised Pretraining via Masked Autoencoding of Local Reference Frames for Rotation-Invariant 3D Point Set Analysis
    Furuya, Takahiko
    IEEE ACCESS, 2024, 12 : 73340 - 73353
  • [34] AnchorPoint: Query Design for Transformer-Based 3D Object Detection and Tracking
    Liu, Hao
    Ma, Yanni
    Wang, Hanyun
    Zhang, Chaobo
    Guo, Yulan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (10) : 10988 - 11000
  • [35] Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection
    Kim, Taeho
    Kim, Joohee
    SENSORS, 2023, 23 (16)
  • [36] Bilateral transformer 3D planar recovery
    Ren, Fei
    Liao, Chunhua
    Xie, Zhina
    GRAPHICAL MODELS, 2024, 134
  • [37] 3D-CNN-SPP: A Patient Risk Prediction System From Electronic Health Records via 3D CNN and Spatial Pyramid Pooling
    Ju, Ronghui
    Zhou, Pan
    Wen, Shiping
    Wei, Wei
    Xue, Yuan
    Huang, Xiaolei
    Yang, Xin
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2021, 5 (02) : 247 - 261
  • [38] Spectrum Prediction With Deep 3D Pyramid Vision Transformer Learning
    Pan, Guangliang
    Wu, Qihui
    Zhou, Bo
    Li, Jie
    Wang, Wei
    Ding, Guoru
    Yau, David K. Y.
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2025, 24 (01) : 509 - 525
  • [39] Dual-Path Transformer for 3D Human Pose Estimation
    Zhou, Lu
    Chen, Yingying
    Wang, Jinqiao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3260 - 3270
  • [40] Graph Transformer for 3D point clouds classification and semantic segmentation
    Zhou, Wei
    Wang, Qian
    Jin, Weiwei
    Shi, Xinzhe
    He, Ying
    COMPUTERS & GRAPHICS-UK, 2024, 124