An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting

Cited by: 15
Authors
Yan, Ming [1, 2]
Li, Zhongtong [2]
Yu, Xinyan [3]
Jin, Cong [2]
Affiliations
[1] Commun Univ China, State Key Lab Media Convergence & Commun, Beijing 100024, Peoples R China
[2] Commun Univ China, Sch Informat & Telecommun Engn, Beijing 100024, Peoples R China
[3] Commun Univ China, Sch Data Sci & Media Intelligence, Beijing 100024, Peoples R China
Keywords
Three-dimensional displays; Two dimensional displays; Cameras; Object detection; Streaming media; Machine learning; Robot sensing systems; 3D object detection; RGB-D; Hough voting; PointRCNN; VISION; REPRESENTATION;
DOI
10.1109/ACCESS.2020.3012695
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing outdoor three-dimensional (3D) object detection algorithms mainly rely on a single type of sensor, for example a monocular camera alone or a radar point cloud alone. However, camera sensors are affected by lighting and lose depth information, and when a short-range radar point cloud sensor scans a distant or occluded object, the data it collects are very sparse, which degrades detection. To address these challenges, we design a deep learning network that combines the texture information of two-dimensional (2D) data with the geometric information of 3D data for object detection. To overcome the limits of a single sensor, we use a reverse mapping layer and an aggregation layer to combine the texture information of RGB data with the geometric information of point cloud data, and we design a maximum pooling layer to handle input from multi-view cameras. In addition, to remedy the shortcomings of 3D object detection algorithms based on the region proposal network (RPN) method, we use a Hough voting algorithm implemented by a deep neural network to propose objects. Experimental results show that, compared with PointRCNN, our algorithm has a 1.06% lower average precision (AP) in easy car object detection but requires 37.7% less computation time under the same hardware environment. Moreover, our algorithm improves the AP by 1.14% over PointRCNN in hard car object detection.
Pages: 138810-138822
Number of pages: 13