Learning-Based Probabilistic LTL Motion Planning With Environment and Motion Uncertainties

Cited by: 33
Authors
Cai, Mingyu [1]
Peng, Hao [2]
Li, Zhijun [3]
Kan, Zhen [3]
Affiliations
[1] Univ Iowa, Dept Mech Engn, Iowa City, IA 52246 USA
[2] ApexAI Inc, Palo Alto, CA 94303 USA
[3] Univ Sci & Technol China, Dept Automat, Hefei 230052, Peoples R China
Keywords
Uncertainty; Probabilistic logic; Task analysis; Planning; Learning automata; Markov processes; Autonomous agents; Linear temporal logic (LTL); Markov decision process (MDP); motion planning; reinforcement learning; MARKOV DECISION-PROCESSES; LOGIC; FRAMEWORK
DOI
10.1109/TAC.2020.3006967
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article considers control synthesis of an autonomous agent with linear temporal logic (LTL) specifications subject to environment and motion uncertainties. Specifically, the probabilistic motion of the agent is modeled by a Markov decision process (MDP) with unknown transition probabilities. The operating environment is assumed to be partially known, where the desired LTL specifications might be partially infeasible. A relaxed product MDP is constructed that allows the agent to revise its motion plan without strictly following the desired LTL constraints. A utility function composed of violation cost and state rewards is developed. Rigorous analysis shows that, if there almost surely (i.e., with probability 1) exists a policy that satisfies the relaxed product MDP, any algorithm that optimizes the expected utility is guaranteed to find such a policy. A reinforcement learning-based approach is then developed to generate policies that fulfill the desired LTL specifications as much as possible by optimizing the expected discounted utility of the relaxed product MDP.
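The idea in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm or their product construction: it is plain tabular Q-learning on a hypothetical five-cell corridor, with every name and number below (the cell labels, `SLIP`, `GOAL_REWARD`, `VIOLATION_COST`, the two-state automaton) assumed for illustration. The task "reach the goal while avoiding the hazard" is strictly infeasible here because the hazard lies between the start and the goal, so the relaxed model charges a violation cost for entering the hazard instead of forbidding the move. The learner never sees the slip probability, which stands in for the unknown transition probabilities.

```python
import random

N_CELLS = 5            # corridor cells 0..4 (hypothetical environment)
HAZARD, GOAL = 2, 4    # assumed labels: cell 2 is unsafe, cell 4 is the target
ACTIONS = (-1, +1)     # move left / move right
SLIP = 0.1             # motion uncertainty; used by the simulator only
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.2
GOAL_REWARD, VIOLATION_COST = 10.0, 3.0   # assumed utility weights


def step(cell, q, action, rng):
    """Simulate one move on the relaxed product state (cell, q).

    q = 0 means the goal has not been reached yet, q = 1 means it has.
    Returns (next_cell, next_q, utility).
    """
    if rng.random() < SLIP:
        action = -action                      # the agent slips the other way
    nxt = min(max(cell + action, 0), N_CELLS - 1)
    utility = 0.0
    if nxt == HAZARD:
        utility -= VIOLATION_COST             # relaxed: penalised, not pruned
    if q == 0 and nxt == GOAL:
        q = 1                                 # automaton reaches its accepting state
        utility += GOAL_REWARD
    return nxt, q, utility


def train(episodes=3000, horizon=30, seed=0):
    """Tabular Q-learning on the expected discounted utility."""
    rng = random.Random(seed)
    Q = {(c, q, a): 0.0 for c in range(N_CELLS) for q in (0, 1) for a in ACTIONS}
    for _ in range(episodes):
        cell, q = 0, 0
        for _ in range(horizon):
            if rng.random() < EPS:            # epsilon-greedy exploration
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: Q[(cell, q, b)])
            nxt, nq, u = step(cell, q, a, rng)
            done = nq == 1                    # simplification: stop once accepted
            target = u if done else u + GAMMA * max(Q[(nxt, nq, b)] for b in ACTIONS)
            Q[(cell, q, a)] += ALPHA * (target - Q[(cell, q, a)])
            cell, q = nxt, nq
            if done:
                break
    return Q


def greedy_policy(Q):
    """Greedy action per cell while the goal is still pending (q = 0)."""
    return {c: max(ACTIONS, key=lambda b: Q[(c, 0, b)]) for c in range(N_CELLS)}
```

Calling `greedy_policy(train())` returns a map from cell to preferred move. Ending the episode at the first visit to the goal is a simplification that suits a reach-avoid task only; general LTL acceptance conditions over infinite runs need a richer automaton than this two-state stand-in.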
Pages: 2386-2392
Page count: 7