Reinforcement learning-based optimal control for Markov jump systems with completely unknown dynamics

Cited by: 6
Authors
Shi, Xiongtao [1, 2]
Li, Yanjie [1, 2]
Du, Chenglong [3 ]
Chen, Chaoyang [4 ]
Zong, Guangdeng [5 ]
Gui, Weihua [3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Guangdong Key Lab Intelligent Morphing Mech & Adap, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol Shenzhen, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
[4] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan 411201, Peoples R China
[5] Tiangong Univ, Sch Control Sci & Engn, Tianjin 300387, Peoples R China
Keywords
Markov jump systems; Optimal control; Coupled algebraic Riccati equation; Parallel policy iteration; Reinforcement learning; ADAPTIVE OPTIMAL-CONTROL; TRACKING CONTROL; LINEAR-SYSTEMS;
DOI
10.1016/j.automatica.2024.111886
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, the optimal control problem for a class of unknown Markov jump systems (MJSs) is investigated via parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, a model-based PPI-RL algorithm is studied for MJSs with known dynamics: by solving linear parallel Lyapunov equations, it learns the solution of the nonlinear coupled algebraic Riccati equation (CARE) and thereby updates the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the case in which the MJS dynamics are partially unknown; here the optimal solution of the CARE is learned from the mixed input-output data of all modes. Furthermore, for MJSs with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed that obtains the optimal control gain by removing the dependence on model information when solving for the optimal solution of the CARE. It is proved that the proposed PPI-RL algorithms converge to the unique optimal solution of the CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results demonstrate the feasibility and effectiveness of the PPI-RL algorithms.
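The model-based step described in the abstract (evaluate each mode's policy with a linear Lyapunov equation, then update the gain) can be illustrated with a minimal sketch. This is not the paper's algorithm as published; it assumes a continuous-time linear MJS dx/dt = A_i x + B_i u with a quadratic cost weighted by Q_i and R_i and a transition-rate matrix Pi, none of which are specified in the abstract. All function names, the toy two-mode scalar system, and the zero initial gains are hypothetical choices made only for illustration.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def care_residual(A, B, Q, R, Pi, P):
    """Residual of the assumed coupled algebraic Riccati equation, per mode."""
    N = len(A)
    res = []
    for i in range(N):
        G = B[i] @ np.linalg.solve(R[i], B[i].T)
        coupling = sum(Pi[i, j] * P[j] for j in range(N))
        res.append(A[i].T @ P[i] + P[i] @ A[i] + Q[i] - P[i] @ G @ P[i] + coupling)
    return res

def parallel_policy_iteration(A, B, Q, R, Pi, K0, iters=60):
    """Decoupled policy iteration: one Lyapunov solve per mode per sweep.

    K0 must be stabilizing gains (an assumption of this sketch).
    """
    N, n = len(A), A[0].shape[0]
    K = [k.copy() for k in K0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        P_new = []
        for i in range(N):
            # Policy evaluation: the other modes' P_j come from the previous
            # sweep, so the N Lyapunov equations are independent of each other.
            Ac = A[i] - B[i] @ K[i] + 0.5 * Pi[i, i] * np.eye(n)
            coupling = sum(Pi[i, j] * P[j] for j in range(N) if j != i)
            M = Q[i] + K[i].T @ R[i] @ K[i] + coupling
            P_new.append(solve_lyapunov(Ac, M))
        P = P_new
        # Policy improvement: K_i = R_i^{-1} B_i^T P_i
        K = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(N)]
    return P, K

# Hypothetical two-mode scalar example (values chosen only for illustration).
A = [np.array([[-1.0]]), np.array([[-2.0]])]
B = [np.array([[1.0]]), np.array([[1.0]])]
Q = [np.eye(1), np.eye(1)]
R = [np.eye(1), np.eye(1)]
Pi = np.array([[-1.0, 1.0], [1.0, -1.0]])  # rows of a rate matrix sum to zero
K0 = [np.zeros((1, 1)), np.zeros((1, 1))]  # stabilizing because A_i are stable

P, K = parallel_policy_iteration(A, B, Q, R, Pi, K0)
residual = max(np.abs(r).max() for r in care_residual(A, B, Q, R, Pi, P))
```

In this sketch the coupling term uses the previous sweep's solutions, which is what lets the per-mode equations be solved independently; the partially and completely model-free variants described in the abstract would replace the Lyapunov solve with quantities estimated from data, which is not shown here.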
Pages: 8
Related papers
50 in total
  • [41] Distributed Tracking Control of Completely Unknown Heterogeneous MASs Based on Reinforcement Learning
    Wang, Zhipeng
    Huo, Shicheng
    2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024, : 587 - 592
  • [42] Robust control scheme for a class of uncertain nonlinear systems with completely unknown dynamics using data-driven reinforcement learning method
    Jiang, He
    Zhang, Huaguang
    Cui, Yang
    Xiao, Geyang
    NEUROCOMPUTING, 2018, 273 : 68 - 77
  • [43] Non-zero-sum games of discrete-time Markov jump systems with unknown dynamics: An off-policy reinforcement learning method
    Zhang, Xuewen
    Shen, Hao
    Li, Feng
    Wang, Jing
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (02) : 949 - 968
  • [44] Optimal robust formation control for heterogeneous multi-agent systems based on reinforcement learning
    Yan, Bing
    Shi, Peng
    Lim, Cheng-Chew
    Shi, Zhiyuan
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (05) : 2683 - 2704
  • [45] Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints
    Huang, Miao
    Liu, Cong
    He, Xiaoqi
    Ma, Longhua
    Lu, Zheming
    Su, Hongye
    NEUROCOMPUTING, 2020, 402 : 50 - 65
  • [46] Adaptive optimization algorithm for nonlinear Markov jump systems with partial unknown dynamics
    Fang, Haiyang
    Zhu, Guozheng
    Stojanovic, Vladimir
    Nie, Rong
    He, Shuping
    Luan, Xiaoli
    Liu, Fei
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31 (06) : 2126 - 2140
  • [47] A neural network based online learning and control approach for Markov jump systems
    Zhong, Xiangnan
    He, Haibo
    Zhang, Huaguang
    Wang, Zhanshan
    NEUROCOMPUTING, 2015, 149 : 116 - 123
  • [48] Reinforcement Learning for H∞ Optimal Control of Unknown Continuous-Time Linear Systems
    Li, Hongyang
    Wei, Qinglai
    Tan, Xiangmin
    IEEE TRANSACTIONS ON CYBERNETICS, 2025, 55 (05) : 2379 - 2389
  • [49] H∞ Control for Interconnected Systems With Unknown System Dynamics: A Two-Stage Reinforcement Learning Method
    Liu, Jinxu
    Shen, Hao
    Wang, Jing
    Cao, Jinde
    Rutkowski, Leszek
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 6388 - 6397
  • [50] Integral Reinforcement Learning-Based Adaptive NN Control for Continuous-Time Nonlinear MIMO Systems With Unknown Control Directions
    Guo, Xinxin
    Yan, Weisheng
    Cui, Rongxin
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (11) : 4068 - 4077