Synthesis of Controllers for Co-Safe Linear Temporal Logic Specifications using Reinforcement Learning

Cited by: 0
Authors
Ren, Xiaohua [1]
Yin, Xiang [1]
Li, Shaoyuan [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
Source
2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Syntactically co-safe Linear Temporal Logics; Reward design;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Interest in controller synthesis for complex tasks is growing rapidly [1, 2]. In most cases the environment is unknown, which limits the applicability of traditional control methods. In this paper, we use reinforcement learning to learn to achieve complex tasks optimally in unknown environments. Specifically, we model the uncertain environment as a Markov decision process (MDP) and describe the high-level control objective in syntactically co-safe linear temporal logic (scLTL). In this setting, we propose a new method for reward design: the proposed reward function exploits information from the automata induced by the scLTL tasks. We compare the proposed reward function with existing approaches in standard grid-world environments and show that, with our reward function, the learning process converges faster and ultimately achieves the scLTL tasks optimally.
Pages: 2304-2309
Number of pages: 6
Related papers
19 in total
  • [1] Approximate Abstractions of Stochastic Hybrid Systems
    Abate, Alessandro
    D'Innocenzo, Alessandro
    Di Benedetto, Maria D.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2011, 56 (11) : 2688 - 2694
  • [2] Baier C, 2008, PRINCIPLES OF MODEL CHECKING, P1
  • [3] Brockman Greg, 2016, OPENAI GYM
  • [4] Fu Jie, 2014, P ROB SCI SYST ROB C
  • [5] Hasanbeig M., 2018, ARXIV180108099
  • [6] Icarte RT, 2018, PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), P452
  • [7] Approximations of Stochastic Hybrid Systems
    Julius, A. Agung
    Pappas, George J.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2009, 54 (06) : 1193 - 1203
  • [8] Model checking of safety properties
    Kupferman, O
    Vardi, MY
    [J]. FORMAL METHODS IN SYSTEM DESIGN, 2001, 19 (03) : 291 - 314
  • [9] Formal Verification and Synthesis for Discrete-Time Stochastic Systems
    Lahijanian, Morteza
    Andersson, Sean B.
    Belta, Calin
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2015, 60 (08) : 2031 - 2045
  • [10] Li X, 2017, IEEE INT C INT ROBOT, P3834, DOI 10.1109/IROS.2017.8206234