BIK-BUS: Biologically Motivated 3D Keypoint Based on Bottom-Up Saliency

Cited by: 18
Authors
Filipe, Silvio [1]
Itti, Laurent [2]
Alexandre, Luis A. [1]
Affiliations
[1] Univ Beira Interior, Inst Telecomunicacoes, P-6200001 Covilha, Portugal
[2] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
Keywords
3D keypoints; 3D interest points; 3D object recognition; performance evaluation; OBJECT RECOGNITION; VISUAL-ATTENTION; SHIFTS; MODEL
DOI
10.1109/TIP.2014.2371532
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
One of the major problems in developing a 3D recognition system is the choice of keypoint detector and descriptor. To help solve this problem, we present a new method for detecting 3D keypoints on point clouds, and we benchmark every pair of 3D keypoint detector and 3D descriptor to evaluate their performance on object and category recognition. These evaluations are carried out on a public database of real 3D objects. Our keypoint detector is inspired by the behavior and neural architecture of the primate visual system. The 3D keypoints are extracted from a bottom-up 3D saliency map, that is, a map that encodes the saliency of objects in the visual environment. The saliency map is determined by computing conspicuity maps (a combination across different modalities) of the orientation, intensity, and color information in a bottom-up, purely stimulus-driven manner. These three conspicuity maps are fused into a 3D saliency map and, finally, the focus of attention (or keypoint location) is sequentially directed to the most salient points in this map. Inhibiting the attended location automatically allows the system to attend to the next most salient location. The main conclusions are as follows: with a similar average number of keypoints, our 3D keypoint detector outperforms the other eight 3D keypoint detectors evaluated, achieving the best result in 32 of the evaluated metrics in the category and object recognition experiments, whereas the second-best detector obtained the best result in only eight of these metrics. The only drawback is the computational time, since the biologically inspired 3D keypoint based on bottom-up saliency (BIK-BUS) is slower than the other detectors. Given the large differences in recognition performance, size, and time requirements, the selection of the keypoint detector and descriptor has to be matched to the desired task, and we give some directions to facilitate this choice.
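The fusion and sequential selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the equal-weight averaging used for fusion, the min-max normalization, and the fixed `inhibition_radius` are all assumptions made for illustration, and the per-point conspicuity values are taken as given rather than computed from a point cloud.

```python
import numpy as np


def normalize(values):
    """Rescale a per-point map to [0, 1]; a constant map becomes all zeros."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def fuse_conspicuity(orientation, intensity, color):
    """Fuse three conspicuity maps into one saliency map.

    Assumption: a plain average of the normalized maps. The abstract
    only states that the maps are fused, not how.
    """
    return (normalize(orientation) + normalize(intensity) + normalize(color)) / 3.0


def select_keypoints(points, saliency, n_keypoints, inhibition_radius):
    """Pick the most salient point, inhibit its neighbourhood, and repeat.

    Returns indices into `points`. Stops early once every point has
    been inhibited.
    """
    points = np.asarray(points, dtype=float)
    remaining = np.array(saliency, dtype=float)  # copy, so the input is untouched
    keypoints = []
    for _ in range(n_keypoints):
        winner = int(np.argmax(remaining))
        if np.isneginf(remaining[winner]):
            break  # nothing left to attend to
        keypoints.append(winner)
        # Inhibition of return: suppress the winner and its neighbours.
        distances = np.linalg.norm(points - points[winner], axis=1)
        remaining[distances <= inhibition_radius] = -np.inf
    return keypoints


if __name__ == "__main__":
    cloud = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                      [5.0, 0.0, 0.0], [5.1, 0.0, 0.0]])
    saliency = fuse_conspicuity([0.9, 0.8, 0.1, 0.6],
                                [0.9, 0.7, 0.2, 0.7],
                                [0.8, 0.8, 0.1, 0.5])
    print(select_keypoints(cloud, saliency, n_keypoints=3, inhibition_radius=0.5))
```

In this sketch, inhibiting everything within the radius of each winner is what moves the focus of attention to a different region on the next iteration, instead of returning several keypoints from the same salient cluster.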
Pages: 163-175
Number of pages: 13