Sim-to-Real Visual Grasping via State Representation Learning Based on Combining Pixel-Level and Feature-Level Domain Adaptation

Cited by: 4

Authors:

Park, Youngbin [1]

Lee, Sang Hyoung [2]

Suh, Il Hong [3]

Affiliations:

[1] Hanyang Univ, Dept Elect & Comp Engn, Seoul, South Korea

[2] Korea Inst Ind Technol, Innovat Smart Mfg R&D Dept, Seoul, South Korea

[3] CogAplex Co Ltd, Seoul, South Korea

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021) | 2021

Keywords:

DOI:

10.1109/ICRA48506.2021.9561302

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this study, we present a method for grasping diverse unseen real-world objects using off-policy actor-critic deep reinforcement learning (RL), with the help of simulation and as little real-world data as possible. Actor-critic deep RL is unstable and difficult to tune when a raw image is given as input. Therefore, we use state representation learning (SRL) to make actor-critic RL feasible for visual grasping tasks. Meanwhile, to reduce the visual reality gap between simulation and reality, we also employ a typical pixel-level domain adaptation that maps simulated images to realistic ones. In our method, because the SRL model is a common preprocessing module for simulated and real-world data, we perform SRL using real and adapted images. This pixel-level domain adaptation enables the robot to learn grasping skills in a real environment using small amounts of real-world data. However, the controller trained in simulation should also adapt to the real world efficiently. Hence, we propose a method that combines a typical pixel-level domain adaptation with the proposed SRL model, in which SRL is performed based on a feature-level domain adaptation. In evaluations on vision-based robotic grasping tasks, we show that the proposed method achieves a substantial improvement over methods that employ only pixel-level or only feature-level domain adaptation.

Pages: 6300-6307

Page count: 8