Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

Cited by: 0

Authors:

Chen, Yuanpei [1]

Wang, Chen [1]

Li Fei-Fei [1]

Liu, Karen [1]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

CONFERENCE ON ROBOT LEARNING, VOL 229 | 2023 / Volume 229

Funding:

National Science Foundation (USA);

Keywords:

Dexterous Manipulation; Long-Horizon Manipulation; Reinforcement Learning; MOTION; TASK;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many real-world manipulation tasks consist of a series of subtasks that differ significantly from one another. Such long-horizon, complex tasks highlight the potential of dexterous hands, whose adaptability and versatility let them transition seamlessly between different modes of functionality without re-grasping or external tools. However, challenges arise from the high-dimensional action space of a dexterous hand and the complex compositional dynamics of long-horizon tasks. We present Sequential Dexterity, a general system based on reinforcement learning (RL) that chains multiple dexterous policies to achieve long-horizon task goals. The core of the system is a transition feasibility function that progressively fine-tunes the sub-policies to improve the chaining success rate, while also enabling autonomous policy switching to recover from failures and bypass redundant stages. Despite being trained only in simulation with a few task objects, our system generalizes to novel object shapes and transfers zero-shot to a real-world robot equipped with a dexterous hand. Code and videos are available at sequential-dexterity.github.io.

References

Pages: 21

66 references in total

