Maximum Entropy Optimal Control of Continuous-Time Dynamical Systems

Cited by: 7

Authors:

Kim, Jeongho [1, 2]

Yang, Insoon [3, 4]

Affiliations:

[1] Seoul Natl Univ, Seoul 08826, South Korea

[2] Korea Inst Adv Study, Seoul 02455, South Korea

[3] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea

[4] Seoul Natl Univ, Automat & Syst Res Inst, Seoul 08826, South Korea

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2023 / Vol. 68 / No. 4

Funding:

National Research Foundation of Singapore;

Keywords:

Dynamic programming (DP); entropy; Hamilton-Jacobi-Bellman (HJB) equations; optimal control; viscosity solution; VISCOSITY SOLUTIONS; RELAXED CONTROLS; EQUATIONS; DIMENSIONALITY; ALGORITHM; CURSE;

DOI:

10.1109/TAC.2022.3168168

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Maximum entropy reinforcement learning methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, most of the existing techniques are designed for discrete-time systems, although there is growing interest in handling physical processes that evolve in continuous time. As a first step toward their extension to continuous-time systems, this article aims to study the theory of maximum entropy optimal control in continuous time. Applying the dynamic programming principle, we derive a novel class of Hamilton-Jacobi-Bellman (HJB) equations and prove that the optimal value function of the maximum entropy control problem corresponds to the unique viscosity solution of the HJB equation. We further show that the optimal control is uniquely characterized as Gaussian in the case of control-affine systems and that, for linear-quadratic problems, the HJB equation is reduced to a Riccati equation, which can be used to obtain an explicit expression of the optimal control. The results of our numerical experiments demonstrate the performance of our maximum entropy method in continuous-time optimal control and reinforcement learning problems.


Pages: 2018-2033

Page count: 16
