Enhancing small object detection in point clouds with self-attention voting network

Cited by: 0
Authors
Zhu, Minghao [1]
Wang, Gaihua [2]
Li, Mingjie [1]
Long, Qian [2]
Zhou, Zhengshu [2]
Affiliations
[1] Hubei Univ Technol, Sch Elect & Elect Engn, Wuhan, Peoples R China
[2] Tianjin Univ Sci & Technol, Coll Artificial Intelligence, Tianjin, Peoples R China
Keywords
3D object detection; point cloud; self-attention; center point voting; multi-scale
DOI
10.1117/1.OE.63.4.043105
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Point cloud-based object detection for autonomous driving has developed rapidly, yet detecting small objects with high precision remains an urgent challenge. To address this issue, we introduce a single-stage 3D detection network, termed self-attention voting-single stage detection (SAV-SSD). It extracts features directly from raw point cloud data and introduces a self-attention voting mechanism that generates center points through weighted voting based on feature correlations. In addition to the feature prediction, we make a separate prediction of the center point, which better controls the position and size of the bounding boxes and improves the accuracy and stability of the predictions. To capture more features of small objects, cross multi-scale feature fusion is designed to establish connections between deep and shallow features. Experimental results demonstrate that SAV-SSD significantly improves the accuracy of pedestrian and cyclist detection while maintaining real-time performance. On the KITTI dataset, SAV-SSD outperforms many state-of-the-art 3D object detection methods.
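The abstract gives no implementation details, so the following is only a rough sketch of what "weighted voting based on feature correlations" could look like, not the SAV-SSD method itself. It assumes each point already has a feature vector and a voted center (its position plus a predicted offset), and it uses scaled dot-product similarity with a softmax as the correlation weight. The function names `softmax` and `attention_weighted_votes` and the toy data are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_votes(features, votes):
    """Refine each point's center estimate by attending over all votes.

    features: list of per-point feature vectors (equal length).
    votes:    list of per-point voted centers [x, y, z].
    Returns one refined center per point: a convex combination of all
    votes, weighted by the softmax of scaled dot-product similarity
    between that point's feature and every other point's feature.
    (Assumed formulation; the paper may define the weights differently.)
    """
    scale = math.sqrt(len(features[0]))
    centers = []
    for f_i in features:
        scores = [sum(a * b for a, b in zip(f_i, f_j)) / scale
                  for f_j in features]
        weights = softmax(scores)
        centers.append([sum(w * v[k] for w, v in zip(weights, votes))
                        for k in range(3)])
    return centers


# Toy data: points 0 and 1 share a feature; point 2 looks different.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
votes = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
centers = attention_weighted_votes(features, votes)
```

In this toy case the two points with matching features receive the same refined center, pulled toward their own votes, while the dissimilar point's center stays closer to its own vote. Because the weights are a softmax, every refined center lies within the bounding range of the input votes.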
Pages: 13