3D Object Detection with LiDAR Based on Multi-Attention Mechanism

Citations: 0
Authors
Cao, Jie [1]
Peng, Yiqiang [1,2,3]
Fan, Likang [1,2,3]
Mo, Lingfan [4]
Wang, Longfei [1]
Affiliations
[1] Xihua Univ, Sch Automobile & Transportat, Chengdu 610039, Sichuan, Peoples R China
[2] Xihua Univ, Vehicle Measurement Control & Safety Key Lab Sichu, Chengdu 610039, Sichuan, Peoples R China
[3] Prov Engn Res Ctr New Energy Vehicle Intelligent C, Chengdu 610039, Sichuan, Peoples R China
[4] Guangdong Xinbao Elect Appliances Holdings Co Ltd, Foshan 528000, Guangdong, Peoples R China
Keywords
LiDAR; 3D object detection; channel attention; spatial attention; point cloud self-attention; VISION
DOI
10.3788/LOP241407
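Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract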
Current 3D object detection algorithms that combine point clouds and voxels perform poorly on small objects. To address this, the paper proposes MA-RCNN, a 3D object detection algorithm based on a multi-attention mechanism. First, a channel attention mechanism is added to the PV-RCNN baseline to process the bird's-eye-view features obtained by compressing the voxel features, with the aim of propagating spatial information to the feature-channel level. Second, a spatial attention mechanism is introduced to amplify locally important information and strengthen the expressive power of the features. Third, in the candidate-box refinement network, a point cloud self-attention mechanism is designed to model relationships between key points, improving the algorithm's understanding of spatial structure. Experiments on the KITTI dataset show that, compared with the baseline, MA-RCNN improves the mean average precision for the small-object classes pedestrian and cyclist by 3.20 and 1.64 percentage points, respectively. MA-RCNN also achieves better detection performance than current mainstream 3D object detection algorithms. Finally, MA-RCNN was deployed on a real-vehicle hardware platform for online testing, and the results support its value for industrial use.
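The abstract names three attention mechanisms but gives no architectural detail, so the following is only a toy sketch of the general ideas, not the MA-RCNN implementation. It assumes a squeeze-and-excitation-style channel gate and a pooling-based spatial gate on a bird's-eye-view feature map, plus plain scaled dot-product self-attention over key-point features. All function names are hypothetical, and the learned layers a real network would use (fully connected layers, convolutions, query/key/value projections) are deliberately omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(bev):
    """Gate each channel of a [C][H][W] map by its global average.

    Assumption: the learned layers between pooling and the sigmoid
    are left out, so the gate depends only on the channel mean.
    """
    gates = []
    for channel in bev:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    scaled = [[[v * g for v in row] for row in channel]
              for channel, g in zip(bev, gates)]
    return scaled, gates


def spatial_attention(bev):
    """Gate each (h, w) cell using mean and max pooling across channels.

    Assumption: the sum of the two pooled values stands in for the
    learned convolution that would normally mix them.
    """
    num_c, height, width = len(bev), len(bev[0]), len(bev[0][0])
    mask = [[0.0] * width for _ in range(height)]
    for h in range(height):
        for w in range(width):
            column = [bev[c][h][w] for c in range(num_c)]
            mask[h][w] = sigmoid(sum(column) / num_c + max(column))
    scaled = [[[bev[c][h][w] * mask[h][w] for w in range(width)]
               for h in range(height)] for c in range(num_c)]
    return scaled, mask


def self_attention(points):
    """Scaled dot-product self-attention over N key-point feature vectors.

    Assumption: queries, keys and values are the raw features
    (identity projections) rather than learned projections.
    """
    dim = len(points[0])
    outputs, weights = [], []
    for query in points:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in points]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append([sum(wt * p[d] for wt, p in zip(row, points))
                        for d in range(dim)])
    return outputs, weights


if __name__ == "__main__":
    # Made-up toy data: 2 channels on a 2x2 grid, and 3 key points.
    toy_bev = [[[0.1, 0.4], [0.0, 0.2]], [[1.0, 0.5], [0.3, 0.9]]]
    toy_points = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
    print(channel_attention(toy_bev)[1])
    print(spatial_attention(toy_bev)[1])
    print(self_attention(toy_points)[1])
```

Each function returns both the reweighted features and the attention weights, so the gates can be inspected directly: channel gates are one scalar per channel, the spatial mask is one scalar per grid cell, and each row of the self-attention weights sums to 1 across key points.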
Pages: 10
Related papers
26 in total
  • [1] Chen YL, 2019, IEEE I CONF COMP VIS, P9774, DOI 10.1109/ICCV.2019.00987
  • [2] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
    Dai, Angela
    Qi, Charles Ruizhongtai
    Niessner, Matthias
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6545 - 6554
  • [3] Vision meets robotics: The KITTI dataset
    Geiger, A.
    Lenz, P.
    Stiller, C.
    Urtasun, R.
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32 (11) : 1231 - 1237
  • [4] Geiger A, 2012, PROC CVPR IEEE, P3354, DOI 10.1109/CVPR.2012.6248074
  • [5] EANTrack: An Efficient Attention Network for Visual Tracking
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    Zhu, Qidan
    Ju, Zhaojie
    [J]. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (04) : 5911 - 5928
  • [6] RPformer: A Robust Parallel Transformer for Visual Tracking in Complex Scenes
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [7] He CH, 2020, PROC CVPR IEEE, P11870, DOI 10.1109/CVPR42600.2020.01189
  • [8] 3D Object Detection Based on Deep Semantics and Position Information Fusion of Laser Point Cloud
    Hu Jie
    An Yongpeng
    Xu Wencai
    Xiong Zongquan
    Liu Han
    [J]. CHINESE JOURNAL OF LASERS-ZHONGGUO JIGUANG, 2023, 50 (10):
  • [9] Hu J, 2018, PROC CVPR IEEE, P7132, DOI [10.1109/TPAMI.2019.2913372, 10.1109/CVPR.2018.00745]
  • [10] DETR with Improved DeNoising Training for Multi-Scale Oriented Object Detection in Optical Remote Sensing Images (Invited)
    Jin Ruijiao
    Wang Kun
    Liu Minhao
    Teng Xichao
    Li Zhang
    Yu Qifeng
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2024, 61 (02)