Category-Level Metric Scale Object Shape and Pose Estimation

Cited by: 40
Authors
Lee, Taeyeop [1]
Lee, Byeong-Uk [1]
Kim, Myungchul [1]
Kweon, I. S. [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Robot & Comp Vis Lab, Daejeon 35200, South Korea
Keywords
Robot manipulation; augmented reality; object shape estimation; object pose estimation
DOI
10.1109/LRA.2021.3110538
Chinese Library Classification (CLC) number
TP24 [Robotics]
Subject classification code
080202; 1405
Abstract
Advances in deep learning recognition have led to accurate object detection with 2D images. However, these 2D perception methods are insufficient for complete 3D world information. Concurrently, advanced 3D shape estimation approaches focus on the shape itself, without considering metric scale. These methods cannot determine the accurate location and orientation of objects. To tackle this problem, we propose a framework that jointly estimates a metric scale shape and pose from a single RGB image. Our framework has two branches: the Metric Scale Object Shape branch (MSOS) and the Normalized Object Coordinate Space branch (NOCS). The MSOS branch estimates the metric scale shape observed in the camera coordinates. The NOCS branch predicts the normalized object coordinate space (NOCS) map and performs similarity transformation with the rendered depth map from a predicted metric scale mesh to obtain 6D pose and size. Additionally, we introduce the Normalized Object Center Estimation (NOCE) to estimate the geometrically aligned distance from the camera to the object center. We validated our method on both synthetic and real-world datasets to evaluate category-level object pose and shape.
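The similarity-transformation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each pixel of the predicted NOCS map has been paired with a 3D point back-projected from the rendered depth map, and it fits scale, rotation, and translation with the standard closed-form Umeyama solution. The abstract does not say which solver or outlier-rejection scheme the authors use, so the function name `umeyama_similarity` and the synthetic data below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def umeyama_similarity(src, dst):
    """Fit s, R, t minimizing ||dst - (s * R @ src + t)||^2 over N point pairs.

    src: (N, 3) points in normalized object coordinates (assumed input).
    dst: (N, 3) matching points in camera coordinates (assumed input).
    """
    n = src.shape[0]
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


# Synthetic check: build camera-frame points from a known pose and scale.
rng = np.random.default_rng(0)
nocs_points = rng.uniform(-0.5, 0.5, size=(50, 3))
angle = 0.3
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
s_true = 0.25
t_true = np.array([0.1, -0.2, 1.5])
camera_points = s_true * nocs_points @ R_true.T + t_true

s_est, R_est, t_est = umeyama_similarity(nocs_points, camera_points)
```

In this sketch the recovered scale plays the role of the metric object size, while the rotation and translation together give the 6D pose. With real predictions the correspondences are noisy, so a fit like this is usually wrapped in a robust estimator such as RANSAC.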
Pages: 8575-8582
Page count: 8