A novel three-dimensional object detection with the modified You Only Look Once

Cited by: 13

Authors:

Zhao, Xia [1]

Jia, Haihang [1]

Ni, Yingting [1]

Affiliations:

[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai, Peoples R China

Source:

INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS | 2018 / Vol. 15 / No. 2

Keywords:

Convolutional neural network; object detection; cluster box; coordinate transformation; 3D object bounding box;

DOI:

10.1177/1729881418765507

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification codes:

080202 ; 1405 ;

Abstract:

Three-dimensional object detection aims to produce a three-dimensional bounding box of an object at its full extent. Nowadays, three-dimensional object detection is mainly based on red green blue-depth (RGB-D) images. However, it remains an open problem because of the difficulty in labeling for three-dimensional training data. In this article, we present a novel three-dimensional object detection method based on two-dimensional object detection, which only takes a set of RGB images as input. First, aiming at the requirement of three-dimensional object detection and the low location accuracy of You Only Look Once, a modified two-dimensional object detection method based on You Only Look Once is proposed. Then, using a set of images from different visual angles, three-dimensional geometric data are reconstructed. In addition, making use of the modified You Only Look Once method, the two-dimensional object bounding boxes of the forward and side views are obtained. Finally, according to the transformation between the two-dimensional pixel coordinate and the three-dimensional space coordinate, the two-dimensional object bounding box is mapped onto the reconstructed three-dimensional scene to form the three-dimensional object box. Because this method only needs the collection of two-dimensional images to train the modified You Only Look Once model, it has a wide range of applications. The experimental results show that the modified You Only Look Once model can improve the location accuracy, and our algorithm can effectively realize the three-dimensional object detection without depth images.
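The final step described in the abstract, mapping a two-dimensional bounding box onto reconstructed three-dimensional geometry, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), reconstructed points already expressed in the camera frame, and a single view, whereas the abstract describes combining forward and side views. All function and parameter names are hypothetical.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point (Z > 0) to pixel coordinates."""
    x, y, z = p
    return fx * x / z + cx, fy * y / z + cy


def box3d_from_box2d(points: List[Point3D], box: Box2D, fx: float, fy: float,
                     cx: float, cy: float) -> Optional[Tuple[Point3D, Point3D]]:
    """Axis-aligned 3D box around the points whose projections fall inside a 2D box.

    Assumption for illustration only: the 2D box comes from some detector and
    the points come from some multi-view reconstruction.
    """
    u_min, v_min, u_max, v_max = box
    inside = []
    for p in points:
        if p[2] <= 0:  # skip points behind the camera
            continue
        u, v = project_point(p, fx, fy, cx, cy)
        if u_min <= u <= u_max and v_min <= v <= v_max:
            inside.append(p)
    if not inside:
        return None
    mins = tuple(min(p[i] for p in inside) for i in range(3))
    maxs = tuple(max(p[i] for p in inside) for i in range(3))
    return mins, maxs
```

With a second view, the same selection could be repeated and the two point sets intersected to tighten the box along the depth axis, which a single view constrains poorly.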

Pages: 13
