Using Goal-Conditioned Reinforcement Learning With Deep Imitation to Control Robot Arm in Flexible Flat Cable Assembly Task

Cited by: 5
Authors
Li, Jingchen [1]
Shi, Haobin [1]
Hwang, Kao-Shing [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
[2] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 81164, Taiwan
[3] Kaohsiung Med Univ, Dept Healthcare Adm & Med Informat, Kaohsiung 80708, Taiwan
Funding
National Natural Science Foundation of China
Keywords
Robots; Manipulators; Reinforcement learning; Task analysis; Connectors; Service robots; Production; Deep reinforcement learning; robot arm; intelligent assembly;
DOI
10.1109/TASE.2023.3323307
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Leveraging reinforcement learning for high-precision decision-making in robot arm assembly scenes is a desired goal in the industrial community. However, tasks like Flexible Flat Cable (FFC) assembly, which require highly trained workers, pose significant challenges due to sparse rewards and limited learning conditions. In this work, we propose a goal-conditioned self-imitation reinforcement learning method for FFC assembly without relying on a specific end-effector, where both perception and behavior planning are learned through reinforcement learning. We analyze the challenges faced by the robot arm in high-precision assembly scenarios and balance the breadth and depth of exploration during training. Our end-to-end model consists of hindsight and self-imitation modules, allowing the robot arm to leverage futile exploration and optimize successful trajectories. Our method does not require rule-based or manual rewards, and it enables the robot arm to quickly find feasible solutions through experience relabeling while avoiding unnecessary exploration. We train the FFC assembly policy in a simulation environment and transfer it to the real scenario using domain adaptation. We explore various combinations of hindsight and self-imitation learning and discuss the results comprehensively. Experimental findings demonstrate that our model achieves fast and advanced flexible flat cable assembly, surpassing other reinforcement learning-based methods.

Note to Practitioners: The motivation of this article stems from the need to develop an efficient and accurate FFC assembly policy for the 3C (Computer, Communication, and Consumer Electronics) industry, promoting the development of intelligent manufacturing. Traditional control methods cannot complete such a high-precision task with a robot arm because the connectors are difficult to model, and existing reinforcement learning methods cannot converge within a restricted number of epochs because of the difficult goals or trajectories. To quickly learn a high-quality assembly policy for the robot arm and accelerate convergence, we combine goal-conditioned reinforcement learning with a self-imitation mechanism, balancing the depth and breadth of exploration. The proposal takes visual information and six-dimensional force as the state and obtains satisfactory assembly policies. We build a simulation scene on the PyBullet platform and pre-train the robot arm in it; the pre-trained policies can then be reused in real scenarios with fine-tuning.
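
The experience relabeling mentioned in the abstract is commonly realized as hindsight relabeling: a failed episode is stored a second time with its goal replaced by a state the agent actually reached, so that a sparse reward still yields a learning signal. The sketch below illustrates that general idea only and is not the authors' implementation. The names `Transition`, `sparse_reward`, `relabel_with_final_goal`, the tolerance `tol`, and the 0 / -1 reward convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of a goal-conditioned episode (hypothetical layout)."""
    state: Vec
    action: Vec
    achieved_goal: Vec  # where the end-effector actually ended up after the action
    desired_goal: Vec   # the goal the policy was conditioned on
    reward: float


def sparse_reward(achieved: Vec, desired: Vec, tol: float = 1e-3) -> float:
    """Assumed sparse reward: 0.0 within tolerance of the goal, else -1.0."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition],
                            tol: float = 1e-3) -> List[Transition]:
    """Return a copy of the episode whose goal is the last achieved goal.

    Every transition keeps its state and action; only the desired goal and
    the reward are recomputed, so the final step counts as a success.
    """
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    return [
        replace(t,
                desired_goal=new_goal,
                reward=sparse_reward(t.achieved_goal, new_goal, tol))
        for t in episode
    ]
```

In a typical setup both the original and the relabeled transitions would be added to the replay buffer. A self-imitation step could then preferentially replay the trajectories that reached their (possibly relabeled) goal, though how the paper combines the two modules is not specified in this record.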
Pages: 6217-6228
Page count: 12