Position Encoding for 3D Lane Detection via Perspective Transformer

Citations: 0

Authors:

Zhang, Meng Li [1]

Wang, Ming Wei [1]

Deng, Yan Yang [1]

Lei, Xin Yu [1]

Affiliations:

[1] Shaanxi Univ Sci & Technol, Shaanxi Joint Lab Artificial Intelligence, Xian 710021, Peoples R China

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Three-dimensional displays; Feature extraction; Lane detection; Encoding; Decoding; Task analysis; Convolution; Deep learning; Machine learning; 3D lane detection; position embedding; view conversion; autonomous vehicle

DOI:

10.1109/ACCESS.2024.3436561

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

3D lane detection from a monocular input image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and temporal tracking to establish the correspondence between 2D and 3D information. However, perturbations during the conversion produce inaccurate depth information, which poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose PELD, a model that uses the bird's-eye view (BEV) as a proxy for the view transformation and outputs 3D lane detection results explicitly. Specifically, when sampling feature information, feature flipping is proposed to supplement the global context before view conversion, and 3D position encoding generated from the front-view features enhances the depth information. After the 3D position encoding is combined with the feature information, the result serves as the value in a cross-attention module that adaptively supervises the BEV queries. On the one hand, we use deformable attention to sample the front-view features and generate an explicit lane representation; on the other hand, we strengthen the supervision of lane line generation by supplementing the front-view features and enhancing the 3D spatial information. PELD achieves better results than previous methods on the OpenLane and Apollo datasets.
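The data flow described in the abstract (flip the front-view features, add a 3D position encoding, then let BEV queries attend to the result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the averaging used to fuse the flipped features, the linear projection used as position encoding, and the per-pixel 3D coordinates are all assumptions made for illustration. Plain scaled dot-product cross-attention is used here in place of the deformable attention named in the abstract.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def flip_and_fuse(feat):
    """Average a front-view feature map (H, W, C) with its horizontal mirror.

    Averaging is an assumed fusion rule; the abstract only states that
    feature flipping supplements global context.
    """
    return 0.5 * (feat + feat[:, ::-1, :])


def position_encoding_3d(coords, weight, bias):
    """Project hypothetical per-pixel 3D coordinates (H, W, 3) to C channels.

    A single linear layer stands in for whatever encoder the paper uses.
    """
    return coords @ weight + bias  # (H, W, 3) @ (3, C) -> (H, W, C)


def bev_cross_attention(queries, feat, pos):
    """BEV queries (N, C) attend to position-encoded front-view features.

    Standard dense cross-attention, swapped in for deformable attention.
    Returns the updated queries (N, C) and the attention weights (N, H*W).
    """
    h, w, c = feat.shape
    keys = (feat + pos).reshape(h * w, c)  # features combined with 3D encoding
    values = keys                          # the combined tensor acts as the value
    scores = queries @ keys.T / np.sqrt(c)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(4, 6, 8))      # toy front-view features
    coords = rng.normal(size=(4, 6, 3))    # toy 3D coordinates per pixel
    queries = rng.normal(size=(5, 8))      # toy BEV queries
    fused = flip_and_fuse(feat)
    pos = position_encoding_3d(coords, rng.normal(size=(3, 8)), np.zeros(8))
    out, weights = bev_cross_attention(queries, fused, pos)
    print(out.shape, weights.shape)
```

In this sketch each BEV query produces one output vector that mixes all front-view locations; a deformable variant would instead sample only a few learned offsets per query.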


Pages: 106480-106487

Page count: 8
