SliceNet: A proficient model for real-time 3D shape-based recognition

Cited by: 15

Authors:

Chen, Xuzhan [1,2]

Chen, Youping [1]

Gupta, Kashish [2]

Zhou, Jie [2,3]

Najjaran, Homayoun [2]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan 430074, Hubei, Peoples R China

[2] Univ British Columbia, Sch Engn, Kelowna, BC V1V 1V7, Canada

[3] Jilin Univ, Commun Engn, Changchun 130012, Jilin, Peoples R China

Source:

NEUROCOMPUTING | 2018 / Vol. 316

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

3D recognition; 3D convolution network; Volumetric image; Real-time recognition;

DOI:

10.1016/j.neucom.2018.07.061

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The field of 3D object recognition has been dominated by 2D view-based methods, mostly because of the lower accuracy and larger computational load of 3D shape-based methods. Recognition with a 3D shape yields appreciable advantages, e.g., making use of depth information and independence from ambient lighting, but we are still far from a definitive solution for 3D shape-based object recognition. In this paper, a statistical method that models the input and output as random variables is first used to investigate the reasons for the inferior performance of the 3D convolution operation. The analysis suggests that an excessively large kernel causes the output variance of the 3D convolution operation to blow up dramatically and makes the output features less discriminative. Then, based on this analysis and inspired by the underlying principle of 3D shapes, SliceNet is proposed to learn 3D shape features using anisotropic 3D convolution. Specifically, the proposed method learns features from the original 2D planar sketches that make up the 3D shape and has a significantly lower output variance. Experiments on ModelNet show that the recognition accuracy of SliceNet is comparable to that of well-established 2D view-based methods. In addition, SliceNet has a significantly smaller model size, a simpler architecture, and shorter training and inference times than 2D view-based and other 3D object recognition methods. An experiment with real-world data shows that the model trained on CAD files generalizes to real-world objects without any re-training or fine-tuning. (C) 2018 Elsevier B.V. All rights reserved.
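The abstract's claim that output variance grows with kernel size can be illustrated with a small numerical sketch. This is not the paper's analysis: it assumes zero-mean, unit-variance, independent inputs and weights, for which the variance of one convolution output equals the number of kernel taps times the weight variance. The function names and the kernel shapes (5, 5, 5) and (1, 5, 5) are hypothetical, chosen only to contrast an isotropic kernel with a slice-wise (anisotropic) one.

```python
import numpy as np


def conv_output_variance(kernel_shape, n_samples=20000, weight_std=1.0, seed=0):
    """Monte Carlo estimate of the variance of a single convolution output.

    Assumption (not from the paper): inputs are i.i.d. N(0, 1) and weights
    are i.i.d. N(0, weight_std**2), independent of the inputs.
    """
    rng = np.random.default_rng(seed)
    n_taps = int(np.prod(kernel_shape))  # number of weights in the kernel
    x = rng.standard_normal((n_samples, n_taps))
    w = rng.normal(0.0, weight_std, size=(n_samples, n_taps))
    y = np.sum(w * x, axis=1)  # one output value per sampled (input, kernel) pair
    return float(np.var(y))


def predicted_variance(kernel_shape, weight_std=1.0):
    """Closed form under the same assumptions: n_taps * Var(w) * E[x^2]."""
    return float(np.prod(kernel_shape)) * weight_std ** 2


# Hypothetical kernel shapes: full 3D kernel vs. a kernel spanning one slice.
iso = conv_output_variance((5, 5, 5))
aniso = conv_output_variance((1, 5, 5))
print(f"isotropic 5x5x5:   simulated {iso:.1f}, predicted {predicted_variance((5, 5, 5)):.1f}")
print(f"anisotropic 1x5x5: simulated {aniso:.1f}, predicted {predicted_variance((1, 5, 5)):.1f}")
```

Under these assumptions the slice-wise kernel has one fifth of the taps and therefore roughly one fifth of the output variance, which is the direction of the effect the abstract describes.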


Pages: 144-155

Page count: 12
