HybridPillars: Hybrid Point-Pillar Network for Real-Time Two-Stage 3-D Object Detection

Cited: 0
Authors
Huang, Zhicong [1]
Huang, Yuxiao [1]
Zheng, Zhijie [1]
Hu, Haifeng [1]
Chen, Dihu [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Sch Integrated Circuits, Shenzhen 518000, Peoples R China
Keywords
Three-dimensional displays; Feature extraction; Proposals; Point cloud compression; Object detection; Convolution; Accuracy; Representation learning; Real-time systems; Pipelines; 3-D object detection; LiDAR point clouds; real time; two-stage;
DOI
10.1109/JSEN.2024.3468646
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
LiDAR-based 3-D object detection is an important perceptual task in fields such as intelligent transportation, autonomous driving, and robotics. Existing two-stage point-voxel methods boost the accuracy of 3-D object detection by using precise pointwise features to refine 3-D proposals. Despite their promising results, these methods are not suitable for real-time applications. First, the inference speed of existing point-voxel hybrid frameworks is slow because retrieving point features from voxel features is time-consuming. Second, existing point-voxel methods rely on 3-D convolution for voxel feature learning, which complicates deployment on embedded computing platforms. To address these issues, we propose a real-time two-stage detection network, named HybridPillars. We first propose a novel hybrid framework that efficiently integrates a point feature encoder into a point-pillar pipeline. By combining point-based and pillar-based networks, our method can discard 3-D convolution and thus reduce computational complexity. Furthermore, we propose a novel pillar feature aggregation network that efficiently extracts bird's-eye-view (BEV) features from pointwise features, significantly enhancing the performance of our network. Extensive experiments demonstrate that HybridPillars not only boosts inference speed but also achieves detection performance competitive with other methods. The code will be available at https://github.com/huangzhicong3/HybridPillars.
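The abstract's key efficiency argument is that a pillar-based pipeline avoids 3-D convolution: each vertical column (pillar) of points is collapsed into a single feature vector, and those vectors are scattered into a dense BEV pseudo-image that ordinary 2-D convolutions can process. The paper's actual pillar feature aggregation network is not described in this record; the following is only a minimal NumPy sketch of the generic PointPillars-style scatter step, with hypothetical function and parameter names:

```python
import numpy as np

def scatter_pillars_to_bev(pillar_features, pillar_coords, grid_h, grid_w):
    """Scatter per-pillar feature vectors into a dense BEV pseudo-image.

    pillar_features: (P, C) array, one feature vector per non-empty pillar
    pillar_coords:   (P, 2) integer (row, col) indices into the BEV grid
    Returns a (C, grid_h, grid_w) array; cells with no pillar stay zero.
    """
    _, C = pillar_features.shape
    bev = np.zeros((C, grid_h, grid_w), dtype=pillar_features.dtype)
    rows, cols = pillar_coords[:, 0], pillar_coords[:, 1]
    # Fancy indexing places each pillar's C-dim feature at its grid cell.
    bev[:, rows, cols] = pillar_features.T
    return bev

# Toy example: 3 pillars with 4-dim features on an 8x8 BEV grid.
feats = np.arange(12, dtype=np.float32).reshape(3, 4)
coords = np.array([[0, 0], [2, 5], [7, 7]])
bev = scatter_pillars_to_bev(feats, coords, grid_h=8, grid_w=8)
```

The resulting (C, H, W) tensor is what a 2-D convolutional backbone and detection head would consume, which is why no 3-D convolution is needed downstream.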
Pages: 38318 - 38328
Page count: 11