Category-Level Metric Scale Object Shape and Pose Estimation

Cited by: 40
Authors
Lee, Taeyeop [1]
Lee, Byeong-Uk [1]
Kim, Myungchul [1]
Kweon, I. S. [1]
Affiliation
[1] Korea Adv Inst Sci & Technol, Robot & Comp Vis Lab, Daejeon 35200, South Korea
Keywords
Robot manipulation; augmented reality; object shape estimation; object pose estimation
DOI
10.1109/LRA.2021.3110538
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Advances in deep learning recognition have led to accurate object detection with 2D images. However, these 2D perception methods are insufficient for complete 3D world information. Concurrently, advanced 3D shape estimation approaches focus on the shape itself, without considering metric scale. These methods cannot determine the accurate location and orientation of objects. To tackle this problem, we propose a framework that jointly estimates a metric scale shape and pose from a single RGB image. Our framework has two branches: the Metric Scale Object Shape branch (MSOS) and the Normalized Object Coordinate Space branch (NOCS). The MSOS branch estimates the metric scale shape observed in the camera coordinates. The NOCS branch predicts the normalized object coordinate space (NOCS) map and performs a similarity transformation with the rendered depth map from a predicted metric scale mesh to obtain the 6D pose and size. Additionally, we introduce the Normalized Object Center Estimation (NOCE) to estimate the geometrically aligned distance from the camera to the object center. We validated our method on both synthetic and real-world datasets to evaluate category-level object pose and shape.
Pages: 8575-8582
Number of pages: 8
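Illustrative sketch
The abstract states that the NOCS branch aligns a predicted NOCS map with a depth map rendered from the predicted metric scale mesh through a similarity transformation. The record does not say which solver is used, so the following is only a minimal sketch under stated assumptions: it uses the standard Umeyama least-squares alignment, assumes pinhole intrinsics (fx, fy, cx, cy), and assumes NOCS values lie in [0, 1] and are shifted by 0.5 to center the object frame. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def backproject(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels to camera-frame 3D points (pinhole model, assumed)."""
    v, u = np.nonzero(mask)          # pixel rows (v) and columns (u) inside the mask
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def umeyama_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Sign correction so that R is a proper rotation (det = +1), not a reflection.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


def pose_from_nocs(nocs_map, depth, mask, fx, fy, cx, cy):
    """Align centered NOCS coordinates to back-projected depth points.

    nocs_map: (H, W, 3) values in [0, 1] (assumed convention).
    Returns scale, rotation, and translation of the object in the camera frame.
    """
    v, u = np.nonzero(mask)
    nocs_pts = nocs_map[v, u] - 0.5  # center the normalized object frame
    cam_pts = backproject(depth, mask, fx, fy, cx, cy)
    return umeyama_similarity(nocs_pts, cam_pts)
```

In this sketch the recovered rotation and translation play the role of the 6D pose and the scale factor plays the role of the object size, because the depth points are in metric units while the NOCS coordinates are normalized. A practical pipeline would typically add outlier rejection (for example RANSAC) around the alignment; that step is omitted here.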