Learning Affordance Space in Physical World for Vision-based Robotic Object Manipulation

Cited: 0
Authors
Wu, Huadong [1 ]
Zhang, Zhanpeng [2 ]
Cheng, Hui [1 ]
Yang, Kai [1 ]
Liu, Jiaming [2 ]
Guo, Ziying [1 ]
Affiliations
[1] Sun Yat Sen Univ, Guangzhou, Peoples R China
[2] SenseTime Grp Ltd, Hong Kong, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) | 2020
Keywords
DOI
10.1109/icra40945.2020.9196783
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
What is a proper representation for objects in manipulation? What would a human try to perceive when manipulating a new object in a new environment? In fact, instead of focusing on texture and illumination, humans can infer the "affordance" [36] of objects from vision. Here "affordance" describes an object's intrinsic property that affords a particular type of manipulation. In this work, we investigate whether such affordance can be learned by a deep neural network. In particular, we propose an Affordance Space Perception Network (ASPN) that takes an image as input and outputs an affordance map. Different from existing works that infer a pixel-wise affordance probability map in image space, our affordance is defined in real-world space, which eliminates the need for hand-eye calibration. In addition, we extend the representational ability of affordance by defining it in a 3D affordance space and propose a novel training strategy to improve performance. Trained purely on simulation data, ASPN achieves strong performance in the real world. It is a task-agnostic framework that can handle different objects, scenes and viewpoints. Extensive real-world experiments demonstrate the accuracy and robustness of our approach. We achieve success rates of 94.2% for single-object pushing and 92.4% for multi-object pushing, as well as 97.2% for single-object grasping and 95.4% for multi-object grasping, outperforming current state-of-the-art methods.
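The abstract describes ASPN only at a high level: an image goes in, an affordance map defined over real-world 3D coordinates comes out, and the robot acts at the most afforded location. The sketch below is a minimal illustration of that idea only, not the paper's architecture; the convolutional backbone, the 32x32x8 workspace grid, and the argmax-based action selection are all placeholder assumptions chosen for brevity.

import torch
import torch.nn as nn

class AffordanceSpacePerceptionNet(nn.Module):
    """Toy stand-in for ASPN: image in, 3D affordance volume out (assumed design)."""
    def __init__(self, grid=(32, 32, 8)):
        super().__init__()
        self.grid = grid  # (X, Y, Z) bins of the workspace; resolution is an assumption
        self.encoder = nn.Sequential(  # placeholder backbone, not the paper's network
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, grid[0] * grid[1] * grid[2])

    def forward(self, image):                  # image: (B, 3, H, W) RGB tensor
        feat = self.encoder(image).flatten(1)  # (B, 64) global feature
        logits = self.head(feat)               # one affordance score per world-space cell
        return logits.view(-1, *self.grid)     # (B, X, Y, Z) affordance volume

def pick_action(affordance, workspace_min, cell_size):
    """Map the highest-scoring cell of an (X, Y, Z) volume to world coordinates in metres."""
    X, Y, Z = affordance.shape
    idx = int(torch.argmax(affordance))        # index into the flattened volume
    x, rest = divmod(idx, Y * Z)
    y, z = divmod(rest, Z)
    return [m + s * c for m, s, c in zip(workspace_min, cell_size, (x, y, z))]

# Example: score a single image and pick where to act, in robot-base coordinates.
net = AffordanceSpacePerceptionNet()
volume = net(torch.rand(1, 3, 224, 224))[0]
print(pick_action(volume, workspace_min=(-0.3, -0.3, 0.0), cell_size=(0.02, 0.02, 0.02)))

Because the output volume is indexed directly in workspace coordinates rather than image pixels, acting on it needs no hand-eye calibration, which is the property the abstract emphasizes.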
Pages: 4652-4658
Number of pages: 7
Related Papers
38 records in total
[11] Jaramillo-Cabrera E. Adaptive Behavior, 2019, 27.
[12] Johns E. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016), 2016: 4461. DOI: 10.1109/IROS.2016.7759657
[13] Kalashnikov D. Proceedings of Machine Learning Research, 2018, 87.
[14] Lenz I, Lee H, Saxena A. Deep learning for detecting robotic grasps. International Journal of Robotics Research, 2015, 34(4-5): 705-724.
[15] Levine S, Pastor P, Krizhevsky A, Ibarz J, Quillen D. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. International Journal of Robotics Research, 2018, 37(4-5): 421-436.
[16] Mahler J. Robotics: Science and Systems XIII, 2017.
[17] Mahler J. IEEE International Conference on Robotics and Automation, 2018: 5620.
[18] Matas J. Proceedings of Machine Learning Research, 2018, 87.
[19] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller M, Fidjeland A K, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533.
[20] Morrison D. Robotics: Science and Systems XIV, 2018.