Position Encoding for 3D Lane Detection via Perspective Transformer

Cited by: 0
Authors
Zhang, Meng Li [1]
Wang, Ming Wei [1]
Deng, Yan Yang [1]
Lei, Xin Yu [1]
Affiliation
[1] Shaanxi Univ Sci & Technol, Shaanxi Joint Lab Artificial Intelligence, Xian 710021, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Three-dimensional displays; Feature extraction; Lane detection; Encoding; Decoding; Task analysis; Convolution; Deep learning; Machine learning; 3D lane detection; position embedding; view conversion; autonomous vehicle; deep learning; machine learning;
DOI
10.1109/ACCESS.2024.3436561
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
3D lane detection from a monocular image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and temporal tracking to establish the correspondence between 2D and 3D information. However, inaccurate depth information caused by perturbations during the conversion poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose PELD, a model that uses the bird's-eye view (BEV) as a proxy for the view transformation and outputs 3D lane detection results explicitly. Specifically, when sampling feature information, feature flipping is used to supplement the global context before view conversion, and 3D position encoding generated from the front-view features enhances the depth information. The 3D position encoding is combined with the feature information, and the result serves as the value in a cross-attention module that adaptively supervises the BEV queries. On the one hand, we use deformable attention to sample the front-view features and generate an explicit lane representation; on the other hand, we strengthen the supervision of lane line generation by supplementing the front-view features and enhancing the 3D spatial information. PELD improves on previous approaches on the OpenLane and Apollo datasets.
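The data flow described in the abstract (flip the front-view features, add a position encoding, then let BEV queries attend to the result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `flip_augment` and `bev_cross_attention` are hypothetical, the flipped features are assumed to be concatenated along the channel axis, the position encoding is a random placeholder rather than one derived from depth, and plain dense scaled dot-product attention stands in for the deformable attention the abstract mentions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def flip_augment(feat):
    # feat: (C, H, W). Assumption: the horizontally flipped copy is
    # concatenated along the channel axis, giving (2C, H, W).
    return np.concatenate([feat, feat[:, :, ::-1]], axis=0)

def bev_cross_attention(bev_queries, front_feat, pos_enc):
    # bev_queries: (Nq, D); front_feat and pos_enc: (D, H, W).
    # Dense attention is used here as a stand-in for deformable attention.
    d = front_feat.shape[0]
    tokens = (front_feat + pos_enc).reshape(d, -1).T   # (H*W, D)
    scores = bev_queries @ tokens.T / np.sqrt(d)       # (Nq, H*W)
    weights = softmax(scores, axis=-1)
    # Position-encoded features act as both keys and values.
    return weights @ tokens, weights

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6, 8))        # toy front-view features
aug = flip_augment(feat)                     # (8, 6, 8)
pos = rng.standard_normal(aug.shape)         # placeholder 3D position encoding
queries = rng.standard_normal((10, aug.shape[0]))  # 10 toy BEV queries
out, w = bev_cross_attention(queries, aug, pos)
```

Each row of `w` sums to one, so every BEV query receives a convex combination of the position-encoded front-view tokens; a deformable variant would instead sample a small set of learned offsets per query.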
Pages: 106480-106487
Number of pages: 8