Learning-Based Bounded Synthesis for Semi-MDPs With LTL Specifications

Cited by: 0
Authors
Oura, Ryohei [1]
Ushio, Toshimitsu [1]
Affiliations
[1] Osaka Univ, Grad Sch Engn Sci, Toyonaka, Osaka 5608531, Japan
Source
IEEE CONTROL SYSTEMS LETTERS | 2022 / Vol. 6
Funding
Japan Science and Technology Agency;
Keywords
Bounded synthesis; linear temporal logic; reinforcement learning; Bayesian inference; semi-Markov decision process;
DOI
10.1109/LCSYS.2022.3169982
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This letter proposes a learning-based bounded synthesis for a semi-Markov decision process (SMDP) with a linear temporal logic (LTL) specification. In the product of the SMDP and the deterministic K-co-Büchi automaton (dKcBA) converted from the LTL specification, we learn both the winning region for satisfying the LTL specification and the dynamics therein based on reinforcement learning and Bayesian inference. Then, we synthesize an optimal policy satisfying the following two conditions. (1) It maximizes the probability of reaching the winning region. (2) It minimizes a long-term risk for the dwell time within the winning region. The minimization of the long-term risk is done based on the estimated dynamics and value iteration. We show that, if the discount factor is sufficiently close to one, the synthesized policy converges to the optimal policy as the amount of data obtained by exploration goes to infinity.
Pages: 2557-2562
Number of pages: 6
Related papers
50 in total
  • [1] Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    Sutton, RS
    Precup, D
    Singh, S
    ARTIFICIAL INTELLIGENCE, 1999, 112 (1-2) : 181 - 211
  • [2] Time-Bounded Mission Planning in Time-Varying Domains with Semi-MDPs and Gaussian Processes
    Duckworth, Paul
    Lacerda, Bruno
    Hawes, Nick
    CONFERENCE ON ROBOT LEARNING, VOL 155, 2020, 155 : 1654 - 1668
  • [3] Bounded Synthesis and Reinforcement Learning of Supervisors for Stochastic Discrete Event Systems With LTL Specifications
    Oura, Ryohei
    Ushio, Toshimitsu
    Sakakibara, Ami
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (10) : 6668 - 6683
  • [4] Zonotope-based Controller Synthesis for LTL Specifications
    Ren, Wei
    Calbert, Julien
    Jungers, Raphael
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021: 580 - 585
  • [5] Synthesis of Output Feedback Control for Motion Planning Based on LTL Specifications
    Wu, Min
    Yan, Gangfeng
    Lin, Zhiyun
    Lan, Ying
    2009 IEEE-RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, 2009: 5071 - 5075
  • [6] Automatic synthesis of multi-agent motion tasks based on LTL specifications
    Loizou, SG
    Kyriakopoulos, KJ
    2004 43RD IEEE CONFERENCE ON DECISION AND CONTROL (CDC), VOLS 1-5, 2004: 153 - 158
  • [7] Learning-Based Probabilistic LTL Motion Planning With Environment and Motion Uncertainties
    Cai, Mingyu
    Peng, Hao
    Li, Zhijun
    Kan, Zhen
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (05) : 2386 - 2392
  • [8] Deep Learning-Enhanced Sampling-Based Path Planning for LTL Mission Specifications
    Baek, Changmin
    Cho, Kyunghoon
    SENSORS, 2024, 24 (10)
  • [9] A Machine Learning-Based Approach for Demarcating Requirements in Textual Specifications
    Abualhaija, Sallam
    Arora, Chetan
    Sabetzadeh, Mehrdad
    Briand, Lionel C.
    Vaz, Eduardo
    2019 27TH IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE (RE 2019), 2019: 51 - 62
  • [10] Dual Learning-Based Safe Semi-Supervised Learning
    Gan, Haitao
    Li, Zhenhua
    Fan, Yingle
    Luo, Zhizeng
    IEEE ACCESS, 2018, 6 : 2615 - 2621