SRFDet3D: Sparse Region Fusion based 3D Object Detection

Citations: 3
Authors
Erabati, Gopi Krishna [1]
Araujo, Helder [1]
Affiliations
[1] Univ Coimbra, Inst Syst & Robot, Rua Silvio Lima Polo 2, P-3030290 Coimbra, Portugal
Funding
European Union Horizon 2020
Keywords
3D object detection; Fusion; Camera; LiDAR; Autonomous driving; Computer vision
DOI
10.1016/j.neucom.2024.127814
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unlike earlier 3D object detection approaches, which formulate hand-crafted dense object proposals (in the thousands) by leveraging anchors on dense feature maps, we formulate n_p (in the hundreds) learnable sparse object proposals to predict 3D bounding box parameters. The sparse proposals in our approach are not only learnt during training but are also input-dependent, so they represent better object candidates during inference. Leveraging the sparse proposals, we fuse only the sparse regions of multi-modal features, and we propose the Sparse Region Fusion based 3D object Detection (SRFDet3D) network with three main components: an encoder for feature extraction, a region proposal generation module for sparse input-dependent proposals, and a decoder for multi-modal feature fusion and iterative refinement of object proposals. Additionally, for optimal training, we formulate our sparse detector with many-to-one label assignment based on the Optimal Transport Algorithm (OTA). We conduct extensive experiments and analysis on publicly available large-scale autonomous driving datasets: nuScenes, KITTI, and Waymo. Our LiDAR-only SRFDet3D-L network achieves 63.1 mAP and outperforms the state-of-the-art networks on the nuScenes dataset, and it surpasses the dense detectors on the KITTI and Waymo datasets. Our LiDAR-Camera model SRFDet3D achieves 64.7 mAP, improving over existing fusion methods.
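The many-to-one label assignment mentioned in the abstract can be illustrated with a generic entropy-regularised optimal transport solver (Sinkhorn iterations). This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `sinkhorn` and `assign_labels`, the fixed number of proposals per object `k`, the constant background cost `bg_cost`, and the toy cost matrix are all hypothetical. Each ground-truth object "supplies" `k` positive labels, a background row supplies the remainder, and every proposal "demands" exactly one label.

```python
import numpy as np


def sinkhorn(cost, supply, demand, eps=0.1, n_iters=200):
    """Entropy-regularised optimal transport via Sinkhorn iterations.

    cost:   (rows, cols) transport cost matrix
    supply: (rows,) mass offered by each row
    demand: (cols,) mass required by each column
    Returns the (rows, cols) transport plan.
    """
    kernel = np.exp(-cost / eps)
    u = np.ones_like(supply)
    v = np.ones_like(demand)
    for _ in range(n_iters):
        u = supply / (kernel @ v)    # rescale rows to match the supply
        v = demand / (kernel.T @ u)  # rescale columns to match the demand
    return u[:, None] * kernel * v[None, :]


def assign_labels(cost_gt, k=2, bg_cost=1.0, eps=0.1):
    """Many-to-one assignment: each ground truth may claim k proposals.

    cost_gt: (num_gt, num_proposals) matching cost (assumed given, e.g. a
             mix of classification and box regression losses).
    Returns one label per proposal: a ground-truth index, or -1 for background.
    """
    num_gt, num_props = cost_gt.shape
    assert num_gt * k <= num_props, "not enough proposals for k per object"
    # Append a background row (hypothetical constant cost).
    cost = np.vstack([cost_gt, np.full((1, num_props), bg_cost)])
    supply = np.concatenate(
        [np.full(num_gt, float(k)), [float(num_props - num_gt * k)]]
    )
    demand = np.ones(num_props)
    plan = sinkhorn(cost, supply, demand, eps=eps)
    labels = plan.argmax(axis=0)   # row that sends the most mass to each proposal
    labels[labels == num_gt] = -1  # background row maps to -1
    return labels


# Toy example: 2 objects, 6 proposals; the last two proposals match nothing well.
toy_cost = np.array([
    [0.1, 0.1, 5.0, 5.0, 5.0, 5.0],
    [5.0, 5.0, 0.1, 0.1, 5.0, 5.0],
])
print(assign_labels(toy_cost, k=2))
```

In this toy setup several proposals are matched to the same object, which is the difference from the one-to-one Hungarian matching often used in sparse detectors. How the matching cost and the per-object supply are actually defined in SRFDet3D is not stated in the abstract.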
Pages: 15