Segmentation based 6D pose estimation using integrated shape pattern and RGB information

Cited by: 0
Authors
Chaochen Gu
Qi Feng
Changsheng Lu
Shuxin Zhao
Rui Xu
Affiliation
[1] Shanghai Jiao Tong University
Source
Pattern Analysis and Applications | 2022 / Vol. 25
Keywords
Convolutional neural network; Point cloud; Semantic segmentation; Pose estimation
DOI
Not available
Abstract
Point clouds are currently the most typical representation of the 3D world. However, recognizing objects and their poses from point clouds remains a major challenge because 3D point data are unordered. This paper proposes a unified deep learning framework for 3D scene segmentation and 6D object pose estimation. To segment foreground objects accurately, we propose a novel shape pattern aggregation module called PointDoN, which learns meaningful deep geometric representations from both the Difference of Normals (DoN) and the initial spatial coordinates of the point cloud. PointDoN can be applied to any convolutional network and improves performance on the popular tasks of point cloud classification and semantic segmentation. Once the objects are segmented, the extent of each object's point cloud within the scene can be determined, which allows the 6D pose of each object to be estimated within a local region of interest. To obtain an accurate estimate, we propose a new 6D pose estimation approach that incorporates both 2D and 3D features, generated from RGB images and point clouds, respectively. Specifically, the 3D features are extracted by a CNN-based architecture whose input is an XYZ map converted from the initial point cloud. Experiments show that the method achieves satisfactory results on publicly available point cloud datasets in both segmentation and 6D pose estimation.
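The Difference of Normals named in the abstract is commonly defined as half the difference between two unit normals estimated at the same point with a small and a large support radius. The following is a minimal NumPy sketch of that operator only, under that assumption; it is not the PointDoN module. The function names, the brute-force neighbour search, and the viewpoint used to orient the normals are illustrative choices, not taken from the paper.

```python
import numpy as np


def estimate_normal(points, idx, radius, viewpoint):
    """Estimate a unit normal at points[idx] by PCA over neighbours within `radius`.

    Hypothetical helper: uses a brute-force radius search for clarity.
    """
    p = points[idx]
    nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
    if len(nbrs) < 3:
        # Too few neighbours to fit a plane; return a zero vector.
        return np.zeros(3)
    cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
    # eigh returns eigenvalues in ascending order, so column 0 is the
    # direction of least variance, i.e. the surface normal.
    _, eigvecs = np.linalg.eigh(cov)
    n = eigvecs[:, 0]
    # Resolve the sign ambiguity by pointing the normal toward the viewpoint.
    if np.dot(viewpoint - p, n) < 0:
        n = -n
    return n


def difference_of_normals(points, r_small, r_large, viewpoint=(0.0, 0.0, 10.0)):
    """Return the per-point DoN vector 0.5 * (n(p, r_small) - n(p, r_large))."""
    points = np.asarray(points, dtype=float)
    viewpoint = np.asarray(viewpoint, dtype=float)
    don = np.empty_like(points)
    for i in range(len(points)):
        n_s = estimate_normal(points, i, r_small, viewpoint)
        n_l = estimate_normal(points, i, r_large, viewpoint)
        don[i] = 0.5 * (n_s - n_l)
    return don
```

On a flat surface the two normals coincide, so the DoN vector is close to zero; larger magnitudes indicate geometry that changes between the two scales, such as edges and corners.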
Pages: 1055-1073
Page count: 18