i2c-net: Using Instance-Level Neural Networks for Monocular Category-Level 6D Pose Estimation

Cited by: 14

Authors:

Remus, Alberto [1]

D'Avella, Salvatore [1]

Di Felice, Francesco [1]

Tripicchio, Paolo [1]

Avizzano, Carlo Alberto [1]

Affiliations:

[1] Scuola Superiore Sant'Anna, Mech Intelligence Inst, Dept Excellence Robot & AI, I-56127 PI Pisa, Italy

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2023 / Vol. 8 / No. 3

Keywords:

Three-dimensional displays; Pose estimation; Solid modeling; Robots; Grasping; Training; Image reconstruction; Perception for grasping and manipulation; deep learning for visual perception; RGB-D perception;

DOI:

10.1109/LRA.2023.3240362

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Object detection and pose estimation are strict requirements for many robotic grasping and manipulation applications to endow robots with the ability to grasp objects with different properties in cluttered scenes and with various lighting conditions. This work proposes the framework i2c-net to extract the 6D pose of multiple objects belonging to different categories, starting from an instance-level pose estimation network and relying only on RGB images. The network is trained on a custom-made synthetic photo-realistic dataset, generated from some base CAD models, opportunely deformed, and enriched with real textures for domain randomization purposes. At inference time, the instance-level network is employed in combination with a 3D mesh reconstruction module, achieving category-level capabilities. Depth information is used for post-processing as a correction. Tests conducted on real objects of the YCB-V and NOCS-REAL datasets outline the high accuracy of the proposed approach.


Pages: 1515-1522

Page count: 8

32 references in total

[1] Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations [J].