3D Object Detection with LiDAR Based on Multi-Attention Mechanism

Cited by: 0

Authors:

Cao, Jie ^{[1]}

Peng, Yiqiang ^{[1,2,3]}

Fan, Likang ^{[1,2,3]}

Mo, Lingfan ^{[4]}

Wang, Longfei ^{[1]}

Affiliations:

[1] Xihua Univ, Sch Automobile & Transportat, Chengdu 610039, Sichuan, Peoples R China

[2] Xihua Univ, Vehicle Measurement Control & Safety Key Lab Sichu, Chengdu 610039, Sichuan, Peoples R China

[3] Prov Engn Res Ctr New Energy Vehicle Intelligent C, Chengdu 610039, Sichuan, Peoples R China

[4] Guangdong Xinbao Elect Appliances Holdings Co Ltd, Foshan 528000, Guangdong, Peoples R China

Source:

LASER & OPTOELECTRONICS PROGRESS | 2025, Vol. 62, No. 04

Keywords:

LiDAR; 3D object detection; channel attention; spatial attention; point cloud self-attention; VISION;

DOI:

10.3788/LOP241407

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Current 3D object detection algorithms that combine point clouds and voxels perform poorly on small objects. To address this, the paper proposes MA-RCNN, a 3D object detection algorithm based on a multi-attention mechanism. First, a channel attention mechanism is added to the PV-RCNN baseline to process the bird's-eye-view features obtained by compressing the voxel features, so that spatial information is propagated to the feature-channel level. Second, a spatial attention mechanism amplifies locally important information, strengthening the expressive power of the features. Third, in the candidate-box refinement network, a point cloud self-attention mechanism models the relationships between key points, improving the algorithm's understanding of spatial structure. On the KITTI dataset, MA-RCNN improves mean average precision over the baseline by 3.20 percentage points for pedestrians and 1.64 percentage points for cyclists, both small-object classes. It also achieves better detection performance than current mainstream 3D object detection algorithms. MA-RCNN was further deployed on a real-vehicle hardware platform for online testing, and the results support its practical value in industrial applications.
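The three attention stages named in the abstract can be illustrated with a minimal NumPy sketch. This is not the MA-RCNN implementation: the abstract does not specify layer designs, so the sketch assumes a squeeze-and-excitation style channel gate, a simplified spatial gate built from channel-wise mean and max maps (scalar mixing weights stand in for a convolution), and single-head scaled dot-product self-attention over key-point features. All function names, shapes, and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(bev, w1, w2):
    """Assumed SE-style gate. bev: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    squeezed = bev.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # bottleneck + ReLU -> (C//r,)
    gate = sigmoid(w2 @ hidden)               # per-channel weight in (0, 1)
    return bev * gate[:, None, None]

def spatial_attention(bev, w_avg, w_max):
    """Assumed simplified spatial gate: one weight per BEV cell."""
    avg_map = bev.mean(axis=0)                # (H, W)
    max_map = bev.max(axis=0)                 # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return bev * gate[None, :, :]

def keypoint_self_attention(feats, wq, wk, wv):
    """Single-head self-attention with a residual. feats: (N, D)."""
    q, k, v = feats @ wq, feats @ wk, feats @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])    # (N, N) pairwise similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return feats + weights @ v, weights

# Toy inputs: random stand-ins for BEV features and key-point features.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
bev = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = spatial_attention(channel_attention(bev, w1, w2), 0.5, 0.5)

N, D = 5, 6
feats = rng.standard_normal((N, D))
wq, wk, wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
refined, weights = keypoint_self_attention(feats, wq, wk, wv)
```

Both gates only rescale the input, so the output keeps the BEV tensor's shape, while the self-attention weights form an N-by-N matrix relating every key point to every other.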

Pages: 10