Exploiting Transformer in Sparse Reward Reinforcement Learning for Interpretable Temporal Logic Motion Planning

Cited by: 5
Authors
Zhang, Hao [1]
Wang, Hao [1]
Kan, Zhen [1]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Transformers; Robots; Reinforcement learning; Planning; Learning automata; Encoding; Linear temporal logic; motion planning; reinforcement learning;
DOI
10.1109/LRA.2023.3290511
Chinese Library Classification (CLC) number
TP24 [Robotics];
Discipline classification code
080202; 1405;
Abstract
Automaton-based approaches have enabled robots to perform various complex tasks. However, most existing automaton-based algorithms rely heavily on manually customized state representations for the considered task, which limits their applicability in deep reinforcement learning. To address this issue, we incorporate the Transformer into reinforcement learning and develop a Double-Transformer-guided Temporal Logic framework (T2TL) that exploits the structural features of the Transformer twice: first encoding the LTL instruction with a Transformer module for efficient understanding of task instructions during training, and then encoding the context variable with a Transformer again for improved task performance. In particular, the LTL instruction is specified in co-safe LTL. LTL progression, a semantics-preserving rewriting operation, is exploited to decompose the complex task into learnable sub-goals, which not only converts the non-Markovian reward decision process into a Markovian one but also improves sampling efficiency through simultaneous learning of multiple sub-tasks. An environment-agnostic LTL pre-training scheme is further incorporated to facilitate learning of the Transformer module, resulting in an improved representation of LTL. Simulation results demonstrate the effectiveness of the T2TL framework.
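The LTL progression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple-based formula encoding, the function name `progress`, and the helpers `_and` and `_or` are assumptions made for illustration, and only basic syntactic simplification is performed. Given the set of propositions true at the current step, the formula is rewritten into what remains to be satisfied, so the remaining formula can serve as a Markovian summary of task progress.

```python
# Illustrative sketch of LTL progression (hypothetical encoding, not the paper's code).
# Formulas are nested tuples; atoms are strings; True/False are Python booleans.

def _and(a, b):
    """Conjunction with basic syntactic simplification."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return a if a == b else ("and", a, b)


def _or(a, b):
    """Disjunction with basic syntactic simplification."""
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return a if a == b else ("or", a, b)


def progress(formula, true_props):
    """Rewrite `formula` after observing the set `true_props` at one step."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):  # atomic proposition
        return formula in true_props
    op = formula[0]
    if op == "not":
        inner = progress(formula[1], true_props)
        return (not inner) if isinstance(inner, bool) else ("not", inner)
    if op == "and":
        return _and(progress(formula[1], true_props),
                    progress(formula[2], true_props))
    if op == "or":
        return _or(progress(formula[1], true_props),
                   progress(formula[2], true_props))
    if op == "next":  # X phi: phi must hold from the next step on
        return formula[1]
    if op == "eventually":  # F phi = phi or X F phi
        return _or(progress(formula[1], true_props), formula)
    if op == "until":  # phi U psi = psi or (phi and X (phi U psi))
        return _or(progress(formula[2], true_props),
                   _and(progress(formula[1], true_props), formula))
    raise ValueError(f"unknown operator: {op!r}")


# "Eventually reach a, and after that eventually reach b."
task = ("eventually", ("and", "a", ("eventually", "b")))
after_a = progress(task, {"a"})    # remaining obligation once a is observed
done = progress(after_a, {"b"})    # becomes True once b is observed

# "Avoid c until a is reached."
avoid = ("until", ("not", "c"), "a")
```

In this sketch a formula that progresses to `True` marks task completion and one that progresses to `False` marks an unrecoverable violation; how such outcomes are mapped to rewards is a design choice the abstract does not specify.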
Pages: 4831-4838
Page count: 8