Learning-Based Bounded Synthesis for Semi-MDPs With LTL Specifications

Times cited: 0

Authors:

Oura, Ryohei [1]

Ushio, Toshimitsu [1]

Affiliations:

[1] Osaka Univ, Grad Sch Engn Sci, Toyonaka, Osaka 5608531, Japan

Source:

IEEE CONTROL SYSTEMS LETTERS | 2022 / Vol. 6

Funding:

Japan Science and Technology Agency;

Keywords:

Bounded synthesis; linear temporal logic; reinforcement learning; Bayesian inference; semi-Markov decision process;

DOI:

10.1109/LCSYS.2022.3169982

Chinese Library Classification (CLC):

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This letter proposes a learning-based bounded synthesis method for a semi-Markov decision process (SMDP) with a linear temporal logic (LTL) specification. In the product of the SMDP and the deterministic K-co-Büchi automaton (dKcBA) converted from the LTL specification, we learn both the winning region for satisfying the LTL specification and the dynamics therein, using reinforcement learning and Bayesian inference. Then, we synthesize an optimal policy satisfying the following two conditions. (1) It maximizes the probability of reaching the winning region. (2) It minimizes a long-term risk for the dwell time within the winning region. The long-term risk is minimized by value iteration on the estimated dynamics. We show that, if the discount factor is sufficiently close to one, the synthesized policy converges to the optimal policy as the amount of data obtained through exploration goes to infinity.
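
The last step of the abstract (estimate the dynamics, then run value iteration on the estimate) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain discrete-time MDP with no dwell times and no product automaton, a symmetric Dirichlet prior on transitions, and a hypothetical per-step cost standing in for the risk. All names (`estimate_transitions`, `value_iteration`, `counts`, `cost`) and the toy numbers are illustrative assumptions.

```python
def estimate_transitions(counts, n_states, prior=1.0):
    """Posterior-mean transition probabilities under a symmetric
    Dirichlet(prior) prior; counts[(s, a)][t] is the number of
    observed transitions from state s to state t under action a."""
    P = {}
    for (s, a), c in counts.items():
        total = sum(c) + prior * n_states
        P[(s, a)] = [(c[t] + prior) / total for t in range(n_states)]
    return P


def q_value(P, cost, V, s, a, gamma):
    """One-step lookahead: immediate cost plus discounted expected value."""
    return cost[(s, a)] + gamma * sum(p * v for p, v in zip(P[(s, a)], V))


def value_iteration(P, cost, n_states, actions, gamma=0.95,
                    tol=1e-8, max_iter=10000):
    """Discounted value iteration that minimizes expected cumulative cost."""
    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V = [min(q_value(P, cost, V, s, a, gamma) for a in actions)
                 for s in range(n_states)]
        converged = max(abs(x - y) for x, y in zip(new_V, V)) < tol
        V = new_V
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    policy = [min(actions, key=lambda a: q_value(P, cost, V, s, a, gamma))
              for s in range(n_states)]
    return V, policy


# Toy data (made up): two states, two actions, state 1 is the costly one.
actions = [0, 1]
counts = {(0, 0): [8, 2], (0, 1): [1, 9], (1, 0): [5, 5], (1, 1): [0, 10]}
cost = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 1.0}

P = estimate_transitions(counts, n_states=2)
V, policy = value_iteration(P, cost, n_states=2, actions=actions)
```

In this toy instance action 0 puts more estimated probability on the cost-free state 0 from either state, so the greedy policy selects action 0 everywhere. As more transitions are observed, the posterior mean approaches the empirical frequencies, which mirrors (in a much simpler setting) the convergence claim in the abstract.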


Pages: 2557-2562

Number of pages: 6
