Position Encoding for 3D Lane Detection via Perspective Transformer

Authors
Zhang, Meng Li [1]
Wang, Ming Wei [1]
Deng, Yan Yang [1]
Lei, Xin Yu [1]
Affiliations
[1] Shaanxi Univ Sci & Technol, Shaanxi Joint Lab Artificial Intelligence, Xian 710021, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Three-dimensional displays; Feature extraction; Lane detection; Encoding; Decoding; Task analysis; Convolution; Deep learning; Machine learning; 3D lane detection; position embedding; view conversion; autonomous vehicle; deep learning; machine learning
DOI
10.1109/ACCESS.2024.3436561
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
3D lane detection from a monocular input image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and time-series tracking to establish the correspondence between 2D and 3D information. However, perturbations during this conversion produce inaccurate depth information, which poses a challenge for lane detection methods that rely only on monocular images. To address this problem, we propose PELD, a model that uses the bird's-eye view (BEV) as a proxy transformation to give explicit 3D lane detection results. Specifically, when sampling feature information, we propose feature flipping to supplement the global context before view conversion, and we use 3D position encoding generated from the front-view features to enhance depth information. The 3D position encoding is combined with the feature information, and the result serves as the value in the cross-attention module for adaptive supervision of the BEV queries. On the one hand, we use deformable attention to sample front-view features and generate an explicit lane representation; on the other hand, we strengthen supervised lane-line generation by supplementing the front-view features and enhancing the 3D spatial information. PELD achieves state-of-the-art performance on the OpenLane and Apollo datasets.
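The data flow the abstract describes (flip the front-view features, add a 3D position encoding, then let BEV queries attend to the result) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no layer details, so the channel-wise concatenation used for flipping, the linear maps `w_fuse` and `w_pos`, the per-pixel 3D points `coords3d`, and all shapes are assumptions made for illustration. Plain softmax cross-attention also stands in for the deformable attention the paper uses.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def flip_augment(feat):
    # feat: (H, W, C). Concatenate the map with its horizontal mirror
    # along the channel axis -> (H, W, 2C). The concatenation is an
    # assumption; the abstract only says features are flipped.
    return np.concatenate([feat, feat[:, ::-1, :]], axis=-1)

def cross_attention(queries, keys, values):
    # Standard scaled dot-product attention (a stand-in for the
    # deformable attention named in the abstract).
    scale = 1.0 / np.sqrt(queries.shape[-1])
    weights = softmax(queries @ keys.T * scale, axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
H, W, C, N = 4, 6, 8, 10                      # hypothetical sizes

feat = rng.standard_normal((H, W, C))         # front-view features
coords3d = rng.standard_normal((H, W, 3))     # hypothetical per-pixel 3D points
w_fuse = rng.standard_normal((2 * C, C))      # hypothetical fusion weights
w_pos = rng.standard_normal((3, C))           # hypothetical position-encoding weights
bev_queries = rng.standard_normal((N, C))     # one query per BEV cell

fused = flip_augment(feat) @ w_fuse           # (H, W, C)
pos = coords3d @ w_pos                        # (H, W, C) 3D position encoding
memory = (fused + pos).reshape(H * W, C)      # features combined with encoding

# The combined tensor serves as both key and value for the BEV queries.
bev_out, attn = cross_attention(bev_queries, memory, memory)
print(bev_out.shape, attn.shape)
```

Each row of `attn` is a distribution over the `H * W` front-view locations, so every BEV query returns a position-aware weighted mix of image features.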
Pages: 106480-106487
Page count: 8