Three-Dimensional Pedestrian Detection by Fusing Image Semantics and Point Cloud Spatial Visibility Features

Cited: 0
Authors
Xiong Lu [1 ]
Deng Zhenwen [1 ]
Tian Wei [1 ]
Wang Zhiang [1 ]
Affiliations
[1] School of Automotive Studies, Tongji University, Shanghai 201804, China
Keywords
object detection; image and point cloud fusion; point cloud spatial visibility; intelligent driving and environmental perception
DOI
10.3788/LOP220712
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Vehicular light detection and ranging (LiDAR) has become a standard automotive sensor, providing intelligent driving vehicles with accurate geometric information about their surroundings. To overcome the limited object detection performance of a single sensor, the geometric and spatial visibility features of LiDAR point clouds are fused with image semantic information in a network framework to achieve accurate three-dimensional (3D) pedestrian detection. First, an efficient 3D ray-casting algorithm is introduced to produce spatial visibility feature encodings. Second, image semantic information is incorporated to enrich the point cloud features. Finally, the impact of the added information and the related hyperparameters on the detection results is examined quantitatively and qualitatively. Experimental results show that, compared with using a single point cloud frame, aggregating the previous 10 point cloud frames improves 3D pedestrian detection accuracy by 32.63 percentage points. By further fusing image semantics and point cloud spatial visibility information, the proposed method improves detection accuracy by 2.42 percentage points over the baseline approach and outperforms several standard methods. The enhanced approach is therefore better suited to 3D pedestrian detection in traffic environments.
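The abstract outlines the pipeline but gives no implementation details. Below is a minimal Python sketch of the kind of ray-casting visibility encoding it describes: voxels traversed by a LiDAR ray before the return are marked free, the voxel containing the return is marked occupied, and all other voxels stay unknown. The function name, grid layout, and sampling step are illustrative assumptions, not code from the paper.

```python
import numpy as np

def raycast_visibility(points, grid_shape, voxel_size, origin=np.zeros(3), step=0.5):
    """Illustrative visibility encoding: 0 = unknown, 1 = free, 2 = occupied.

    For every LiDAR return, positions are sampled along the ray from the
    sensor origin to the return; traversed voxels are marked free and the
    voxel containing the return itself is marked occupied. (Sketch only;
    the paper's actual encoding may differ.)
    """
    vis = np.zeros(grid_shape, dtype=np.uint8)               # 0: unknown
    # Assume the voxel grid is centered on the sensor origin.
    grid_min = origin - 0.5 * np.array(grid_shape) * voxel_size

    def to_index(xyz):
        idx = np.floor((xyz - grid_min) / voxel_size).astype(int)
        return idx if np.all((idx >= 0) & (idx < grid_shape)) else None

    for p in points:
        direction = p - origin
        dist = np.linalg.norm(direction)
        if dist < 1e-6:
            continue
        direction = direction / dist
        # Sample along the ray up to (but not including) the return.
        for t in np.arange(0.0, dist, step):
            idx = to_index(origin + t * direction)
            if idx is not None and vis[tuple(idx)] == 0:
                vis[tuple(idx)] = 1                           # 1: free space
        idx = to_index(p)
        if idx is not None:
            vis[tuple(idx)] = 2                               # 2: occupied
    return vis

# Example: a toy cloud of 100 returns inside a 40 m cube with 0.5 m voxels.
pts = np.random.uniform(-20, 20, size=(100, 3))
visibility = raycast_visibility(pts, grid_shape=(80, 80, 80), voxel_size=0.5)
```

The resulting per-voxel visibility labels can then be concatenated with the voxelized point cloud features (and image semantic scores projected onto the points) before the detection head, which is the general fusion strategy the abstract describes.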
Pages: 8