PVONet: point-voxel-based semi-supervision monocular three-dimensional object detection using LiDAR camera systems

Cited by: 2

Authors:

Wang, Haosen ^{[1]}

Ji, Xiaohang ^{[2]}

Peng, Kejin ^{[2]}

Wang, Wanqiu ^{[3]}

Wang, Shifeng ^{[1,2]}

Affiliations:

[1] Changchun Univ Sci & Technol, Sch Optoelect Engn, Zhongshan Inst, Changchun, Peoples R China

[2] Changchun Univ Sci & Technol, Changchun, Peoples R China

[3] Changchun Univ Sci & Technol, Opt Engn, Changchun, Peoples R China

Source:

JOURNAL OF ELECTRONIC IMAGING | 2023 / Vol. 32 / No. 05

Keywords:

semi-supervised; LiDAR camera system; three-dimensional object detection; feature extraction;

DOI:

10.1117/1.JEI.32.5.053015

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Light detection and ranging (LiDAR) camera systems are becoming increasingly vital for autonomous driving. The monocular three-dimensional (3D) detection task is a critical and challenging aspect of this field. However, most algorithms rely solely on manually labeled images, which is a time-consuming and labor-intensive process, and the resulting detection lacks depth information. To address this problem, a semi-supervised 3D object detection model based on LiDAR camera systems (PVONet) is proposed to improve both the detection accuracy and processing time. First, an innovative data preparation block, point-voxel fusion estimation, is introduced; it utilizes LiDAR points to generate 3D bounding boxes for unlabeled data, thereby significantly reducing the time compared with manual labeling. Second, a new block based on a fully connected neural network for box estimation (feature extraction and 3D object detection) is presented; it is utilized to conduct feature extraction, feature correlation, and 3D box estimation on monocular images. Finally, comprehensive experiments conducted on the popular KITTI 3D detection dataset demonstrate that our PVONet is faster (30 ms on KITTI benchmark) and more accurate [with increases of 4.69%/3.82% (easy), 4.45%/2.79% (moderate), and 4.07%/3.75% (hard) aggregation processes on 3D/bird's eye view objects compared with the baseline]. This meets the requirements for high real-time performance in autonomous vehicle applications. The results demonstrate the effectiveness of our model based on LiDAR camera systems. (c) 2023 SPIE and IS&T

References

Pages: 15

38 entries in total

[1] Balatkan Eren, 2021, 2021 6th International Conference on Computer Science and Engineering (UBMK), P419, DOI 10.1109/UBMK52708.2021.9558880
[2] GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution
Bui, Minh-Quan Viet
Ngo, Duc Tuan
Pham, Hoang-Anh
Nguyen, Duc Dung
[J]. PEERJ COMPUTER SCIENCE, 2021, 7
[3] CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection
Cao, Yuanzhouhan
Zhang, Hui
Li, Yidong
Ren, Chao
Lang, Congyan
[J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 24727 - 24737
[4] Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image
Chabot, Florian
Chaouch, Mohamed
Rabarisoa, Jaonary
Teuliere, Celine
Chateau, Thierry
[J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 1827 - 1836
[5] Multi-View 3D Object Detection Network for Autonomous Driving
Chen, Xiaozhi
Ma, Huimin
Wan, Ji
Li, Bo
Xia, Tian
[J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6526 - 6534
[6] Monocular 3D Object Detection for Autonomous Driving
Chen, Xiaozhi
Kundu, Kaustav
Zhang, Ziyu
Ma, Huimin
Fidler, Sanja
Urtasun, Raquel
[J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
[7] Chen XZ, 2015, ADV NEUR IN, V28
[8] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
Chen, Yongjian
Tai, Lei
Sun, Kai
Li, Mingyang
[J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 12090 - 12099
[9] Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting
Chu, Xiaomeng
Deng, Jiajun
Li, Yao
Yuan, Zhenxun
Zhang, Yanyong
Ji, Jianmin
Zhang, Yu
[J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5239 - 5247
[10] Randaugment: Practical automated data augmentation with a reduced search space
Cubuk, Ekin D.
Zoph, Barret
Shlens, Jonathon
Le, Quoc V.
[J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3008 - 3017