Deep Reinforcement Learning for Concentric Tube Robot Control with a Goal-Based Curriculum

Cited: 4
Authors
Iyengar, Keshav [1 ]
Stoyanov, Danail [1 ]
Affiliations
[1] UCL, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), London W1W 7EJ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1109/ICRA48506.2021.9561620
CLC Classification
TP [Automation Technology, Computer Technology];
Subject Classification
0812;
Abstract
Concentric Tube Robots (CTRs), a type of continuum robot, are collections of concentric, pre-curved tubes made of superelastic nickel-titanium alloy. CTRs bend and twist through the interactions between neighboring tubes, which makes the kinematics, and therefore Cartesian control of the end-effector, very challenging to model. In this paper, we develop a control scheme for a CTR end-effector in Cartesian space, with no prior kinematic model, using a deep reinforcement learning (DRL) approach with a goal-based curriculum reward strategy. We explore the use of curricula by varying the goal tolerance during training with constant, linear, and exponential decay functions, and we investigate relative and absolute joint representations as a way of improving training convergence. We compare combinations of curricula and joint representations quantitatively, and use the relative exponential-decay approach to train a robust policy in a noise-induced simulation environment. Compared to a previous DRL approach, our method reduces training time and uses a more complex simulation environment. We report mean Cartesian errors of 1.29 mm and a success rate of 0.93 with the relative decay curriculum, and mean errors of 1.37 mm in a noise-induced path-following task. Albeit in simulation, these results indicate the promise of DRL for model-free control of continuum robots, and of CTRs in particular.
Pages: 1459-1465
Page count: 7
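
To make the curriculum idea from the abstract concrete, below is a minimal sketch of how a goal-tolerance schedule and a sparse goal-based reward might be implemented. The function names, the initial and final tolerances, and the exact decay forms are illustrative assumptions; the paper specifies only that the goal tolerance is varied during training with constant, linear, and exponential decay functions.

```python
import math


def goal_tolerance(step: int, total_steps: int, schedule: str = "exponential",
                   tol_init: float = 0.020, tol_final: float = 0.001) -> float:
    """Goal tolerance (metres) at a given training step.

    The constant/linear/exponential options mirror the curricula named in
    the abstract; tol_init, tol_final, and the decay forms are assumptions.
    """
    frac = min(step / total_steps, 1.0)  # fraction of training completed
    if schedule == "constant":
        return tol_final
    if schedule == "linear":
        return tol_init + frac * (tol_final - tol_init)
    if schedule == "exponential":
        # Geometric interpolation: tol_init at step 0, tol_final at total_steps.
        return tol_init * math.exp(frac * math.log(tol_final / tol_init))
    raise ValueError(f"unknown schedule: {schedule!r}")


def sparse_goal_reward(cartesian_error: float, tolerance: float) -> float:
    """Assumed sparse goal-based reward: 0 on success, -1 otherwise."""
    return 0.0 if cartesian_error <= tolerance else -1.0
```

Under such a scheme, early episodes are rewarded for merely approaching the Cartesian goal, while the tightening tolerance forces the policy to achieve millimetre-scale accuracy by the end of training; a success rate like the one reported in the abstract would then be measured against the final tolerance.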