Reinforcement learning-based optimal control for Markov jump systems with completely unknown dynamics

Cited by: 6

Authors:

Shi, Xiongtao [1,2]

Li, Yanjie [1,2]

Du, Chenglong [3]

Chen, Chaoyang [4]

Zong, Guangdeng [5]

Gui, Weihua [3]

Affiliations:

[1] Harbin Inst Technol Shenzhen, Guangdong Key Lab Intelligent Morphing Mech & Adap, Shenzhen 518055, Peoples R China

[2] Harbin Inst Technol Shenzhen, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China

[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China

[4] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan 411201, Peoples R China

[5] Tiangong Univ, Sch Control Sci & Engn, Tianjin 300387, Peoples R China

Source:

AUTOMATICA | 2025 / Vol. 171

Keywords:

Markov jump systems; Optimal control; Coupled algebraic Riccati equation; Parallel policy iteration; Reinforcement learning; ADAPTIVE OPTIMAL-CONTROL; TRACKING CONTROL; LINEAR-SYSTEMS;

DOI:

10.1016/j.automatica.2024.111886

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this paper, the optimal control problem for a class of unknown Markov jump systems (MJSs) is investigated via parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, a model-based PPI-RL algorithm is studied for MJSs with known dynamics: by solving linear parallel Lyapunov equations, it learns the solution of the nonlinear coupled algebraic Riccati equation (CARE) and thereby updates the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the scenario in which the dynamics of the MJS are partially unknown; here the optimal solution of the CARE is learned from the mixed input-output data of all modes. Furthermore, for MJSs with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed that obtains the optimal control gain by removing the dependence on model information when solving for the optimal solution of the CARE. It is proved that the proposed PPI-RL algorithms converge to the unique optimal solution of the CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results are presented to show the feasibility and effectiveness of the PPI-RL algorithms.
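The model-based step in the abstract (solve decoupled linear Lyapunov equations, then update each mode's gain) can be illustrated with a minimal sketch. This record does not reproduce the paper's equations, so everything below is an assumption: a continuous-time linear MJS with modes (A_i, B_i), weights (Q_i, R_i), and transition-rate matrix Pi; the CARE form A_i^T P_i + P_i A_i + Q_i - P_i B_i R_i^{-1} B_i^T P_i + sum_j Pi[i, j] P_j = 0; and a Jacobi-style decoupling in which the cross-mode coupling term uses the previous iterate. The function names are hypothetical and this is not presented as the authors' PPI-RL algorithm.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac^T P + P Ac + M = 0 for P via Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec: vec(Ac^T P) = (I kron Ac^T) vec(P),
    #                   vec(P Ac)   = (Ac^T kron I) vec(P).
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(L, -M.flatten(order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, Pi, K0, iters=200):
    """Assumed decoupled policy iteration for the coupled Riccati equations.

    A, B, Q, R, K0 are lists of per-mode matrices; Pi is the rate matrix.
    K0 is assumed to be stabilizing for every mode.
    """
    N = len(A)
    n = A[0].shape[0]
    K = [k.copy() for k in K0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        P_new = []
        for i in range(N):
            # Policy evaluation: one linear Lyapunov equation per mode.
            # The diagonal rate Pi[i, i] is absorbed into the drift; the
            # off-diagonal coupling uses the previous iterate of P.
            Ac = A[i] - B[i] @ K[i] + 0.5 * Pi[i, i] * np.eye(n)
            coupling = sum(Pi[i, j] * P[j] for j in range(N) if j != i)
            M = Q[i] + K[i].T @ R[i] @ K[i] + coupling
            P_new.append(solve_lyapunov(Ac, M))
        P = P_new
        # Policy improvement: K_i = R_i^{-1} B_i^T P_i.
        K = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(N)]
    return P, K


def care_residual(A, B, Q, R, Pi, P):
    """Left-hand side of the assumed CARE for each mode (zero at a solution)."""
    N = len(A)
    res = []
    for i in range(N):
        S = B[i] @ np.linalg.solve(R[i], B[i].T)
        coupled = sum(Pi[i, j] * P[j] for j in range(N))
        res.append(A[i].T @ P[i] + P[i] @ A[i] + Q[i] - P[i] @ S @ P[i] + coupled)
    return res
```

A quick way to use the sketch is to run `policy_iteration` on a small two-mode system with a stabilizing initial gain (for open-loop stable modes, zero gains are a natural choice) and check that `care_residual` is close to zero. The model-free variants described in the abstract would replace the Lyapunov solve with an estimate built from measured data, which is not attempted here.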


Pages: 8

50 items in total

[31] Optimal Event-Triggered H∞ Control for Nonlinear Systems with Completely Unknown Dynamics
Chu, Kun
Peng, Zhinan
Zhang, Zhiquan
Huang, Rui
Shi, Kecheng
Cheng, Hong
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2236 - 2241
[32] A homotopy-based reinforcement learning scheme to optimal control for Markov switched interconnected systems
Liu, Jinxu
Mi, Xuanrui
Xia, Jianwei
Su, Lei
Shen, Hao
JOURNAL OF CONTROL AND DECISION, 2024,
[33] Robust Learning-Based Predictive Control for Discrete-Time Nonlinear Systems With Unknown Dynamics and State Constraints
Zhang, Xinglong
Liu, Jiahang
Xu, Xin
Yu, Shuyou
Chen, Hong
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, 52 (12) : 7314 - 7327
[34] Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics
Kiumarsi, Bahare
Lewis, Frank L.
Modares, Hamidreza
Karimpour, Ali
Naghibi-Sistani, Mohammad-Bagher
AUTOMATICA, 2014, 50 (04) : 1167 - 1175
[35] Stability and Fuzzy Optimal Control for Nonlinear Ito Stochastic Markov Jump Systems via Hybrid Reinforcement Learning
Pang, Zhen
Wang, Hai
Cheng, Jun
Tang, Shengda
Park, Ju H.
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (11) : 6472 - 6485
[36] Optimal Output Regulation of Linear Discrete-Time Systems With Unknown Dynamics Using Reinforcement Learning
Jiang, Yi
Kiumarsi, Bahare
Fan, Jialu
Chai, Tianyou
Li, Jinna
Lewis, Frank L.
IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (07) : 3147 - 3156
[37] Reinforcement learning-based robust optimal tracking control for disturbed nonlinear systems
Fan, Zhong-Xin
Tang, Lintao
Li, Shihua
Liu, Rongjie
NEURAL COMPUTING AND APPLICATIONS, 2023, 35 : 23987 - 23996
[38] Optimized Formation Control Using Simplified Reinforcement Learning for a Class of Multiagent Systems With Unknown Dynamics
Wen, Guoxing
Chen, C. L. Philip
Li, Bin
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2020, 67 (09) : 7879 - 7888
[39] Optimal Reinforcement Learning-Based Control Algorithm for a Class of Nonlinear Macroeconomic Systems
Ding, Qing
Jahanshahi, Hadi
Wang, Ye
Bekiros, Stelios
Alassafi, Madini O.
MATHEMATICS, 2022, 10 (03)
[40] Adaptive Optimal Control of Nonlinear Active Suspension Systems with Completely Unknown Dynamics
Chen, Xin
Huang, Yingbo
Na, Jing
Gao, Guanbin
Zhao, Jun
PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 3524 - 3529
