Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

Cited by: 0
Authors
Chen, Yuanpei [1]
Wang, Chen [1]
Li, Fei-Fei [1]
Liu, Karen [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Source
Conference on Robot Learning, Vol. 229 | 2023 / Vol. 229
Funding
National Science Foundation (USA);
Keywords
Dexterous Manipulation; Long-Horizon Manipulation; Reinforcement Learning; MOTION; TASK;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many real-world manipulation tasks consist of a series of subtasks that differ significantly from one another. Such long-horizon, complex tasks highlight the potential of dexterous hands, which are adaptable and versatile and can transition seamlessly between different modes of functionality without re-grasping or external tools. However, challenges arise from the high-dimensional action space of dexterous hands and the complex compositional dynamics of long-horizon tasks. We present Sequential Dexterity, a general system based on reinforcement learning (RL) that chains multiple dexterous policies to achieve long-horizon task goals. The core of the system is a transition feasibility function that progressively finetunes the sub-policies to improve the chaining success rate, while also enabling autonomous policy switching for recovering from failures and bypassing redundant stages. Despite being trained only in simulation with a few task objects, our system generalizes to novel object shapes and transfers zero-shot to a real-world robot equipped with a dexterous hand. Code and videos are available at sequential-dexterity.github.io.
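The policy-switching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_stage`, the threshold value, and the toy feasibility functions are all hypothetical, and the real feasibility function is learned rather than hand-written. The sketch only assumes that each sub-policy has a feasibility estimate that scores how likely that sub-policy is to succeed from the current state, and it picks the latest stage whose score clears a threshold.

```python
from typing import Callable, Sequence

# Hypothetical type: a feasibility function maps a state to a score in [0, 1].
# In this toy example the "state" is a single float measuring task progress.
FeasibilityFn = Callable[[float], float]


def select_stage(
    state: float,
    feasibility_fns: Sequence[FeasibilityFn],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> int:
    """Return the index of the latest stage judged feasible from `state`.

    Scanning from the last stage backwards means that a state which already
    satisfies a later stage skips the earlier ones (bypassing redundant
    stages), while a state that regressed falls back to an earlier stage
    (recovery from failure). Stage 0 is the default when nothing qualifies.
    """
    for i in reversed(range(len(feasibility_fns))):
        if feasibility_fns[i](state) >= threshold:
            return i
    return 0


# Toy feasibility functions: stage k is feasible once progress reaches k.
toy_fns = [lambda s, k=k: 1.0 if s >= k else 0.0 for k in range(3)]

start = select_stage(0.0, toy_fns)      # nothing done yet: begin at stage 0
skipped = select_stage(2.0, toy_fns)    # already far along: jump to stage 2
recovered = select_stage(1.5, toy_fns)  # progress lost: fall back to stage 1
```

In this sketch the same selection rule covers both behaviors named in the abstract, because the stage is re-chosen from the current state instead of following a fixed order.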
Pages: 21