Hierarchical policy with deep-reinforcement learning for nonprehensile multiobject rearrangement

Cited by: 6

Authors:

Bai, Fan ^{[1]}

Meng, Fei ^{[1]}

Liu, Jianbang ^{[1]}

Wang, Jiankun ^{[2]}

Meng, Max Q.-H. ^{[1,2,3]}

Affiliations:

[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China

[2] Southern Univ Sci & Technol, Dept Elect & Elect Engn, Shenzhen, Peoples R China

[3] Chinese Univ Hong Kong, Shenzhen Res Inst, Shenzhen, Peoples R China

Source:

BIOMIMETIC INTELLIGENCE AND ROBOTICS | 2022 / Vol. 2 / No. 3

Keywords:

Rearrangement; Reinforcement learning; Monte Carlo tree search;

DOI:

10.1016/j.birob.2022.100047

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Nonprehensile multiobject rearrangement is a robotic task in which the robot plans feasible paths and moves multiple objects to predefined target poses without grasping them. The planner must decide both how each object reaches its target and the order in which the objects are moved, which considerably increases the complexity of the problem. To address this, we propose a hierarchical policy for nonprehensile multiobject rearrangement based on deep reinforcement learning. We use imitation learning and reinforcement learning to train a rollout policy. In the high-level policy, the policy network guides the Monte Carlo tree search algorithm to efficiently search for the optimal rearrangement sequence for multiple objects. In the low-level policy, the robot plans paths according to the order of path primitives and manipulates the objects toward their target poses one by one. Our experiments show that the proposed method achieves a higher success rate, fewer steps, and shorter path lengths than state-of-the-art methods.
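The high-level idea in the abstract, a tree search over object orderings guided by a learned policy, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the trained policy network is replaced by a hypothetical hand-written `prior` (a softmax over negative approach distance), objects are 2D points pushed in straight lines with no collisions, the pusher starts at `PUSHER_START`, and the names `path_cost`, `puct`, and `search` are invented for this example.

```python
import math
import random

PUSHER_START = (0.0, 0.0)  # assumed initial pusher position (illustrative)


def path_cost(order, starts, goals):
    """Distance the pusher travels: approach each object, then push it to its goal."""
    total, pos = 0.0, PUSHER_START
    for i in order:
        total += math.dist(pos, starts[i]) + math.dist(starts[i], goals[i])
        pos = goals[i]
    return total


def prior(pos, remaining, starts):
    """Hypothetical stand-in for a policy network: prefers objects close to the pusher."""
    weights = [math.exp(-math.dist(pos, starts[i])) for i in remaining]
    z = sum(weights)
    return {i: w / z for i, w in zip(remaining, weights)}


class Node:
    def __init__(self, p):
        self.p = p          # prior probability of the move leading to this node
        self.n = 0          # visit count
        self.w = 0.0        # sum of returns (negative normalized cost)
        self.children = {}  # object index -> Node


def puct(parent, child, c):
    """PUCT score: mean return plus a prior-weighted exploration bonus."""
    q = child.w / child.n if child.n else 0.0
    return q + c * child.p * math.sqrt(parent.n + 1) / (1 + child.n)


def search(starts, goals, n_sim=300, c=1.5, seed=0):
    """Return the cheapest complete ordering seen during the search, with its cost."""
    rng = random.Random(seed)
    n = len(starts)
    root = Node(1.0)
    best_order, best_cost = None, math.inf
    for _ in range(n_sim):
        node, path, order, pos = root, [root], [], PUSHER_START
        # Selection: descend through already-expanded nodes by PUCT score.
        while node.children:
            parent = node
            i = max(parent.children, key=lambda k: puct(parent, parent.children[k], c))
            node = parent.children[i]
            path.append(node)
            order.append(i)
            pos = goals[i]
        remaining = [i for i in range(n) if i not in order]
        # Expansion: add one child per object that has not been moved yet.
        if remaining:
            for i, p in prior(pos, remaining, starts).items():
                node.children[i] = Node(p)
        # Rollout: complete the ordering by sampling from the stand-in policy.
        rollout = list(order)
        while remaining:
            pr = prior(pos, remaining, starts)
            i = rng.choices(remaining, weights=[pr[k] for k in remaining])[0]
            rollout.append(i)
            remaining.remove(i)
            pos = goals[i]
        cost = path_cost(rollout, starts, goals)
        if cost < best_cost:
            best_order, best_cost = rollout, cost
        # Backup: propagate the negative normalized cost along the selected path.
        for visited in path:
            visited.n += 1
            visited.w += -cost / n
    return best_order, best_cost


# Toy scene (made-up coordinates): four objects and their target positions.
starts = [(1.0, 0.0), (5.0, 5.0), (2.0, 3.0), (0.0, 4.0)]
goals = [(4.0, 0.0), (5.0, 1.0), (0.0, 1.0), (3.0, 4.0)]
order, cost = search(starts, goals)
print("order:", order, "cost:", round(cost, 3))
```

In this sketch the search returns the best complete ordering encountered rather than the most-visited branch, which is one common choice for planning problems with a deterministic cost. A low-level planner, not shown here, would then turn each chosen object into an actual push path.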

Pages: 8