Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning

Cited by: 2
Authors
Ma, Xiao [1 ]
Yuan, Yuan [1 ]
Affiliation
[1] Northwestern Polytech Univ, Sch Astronaut, Xian 710072, Peoples R China
Source
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS | 2024, Vol. 361, Issue 7
Keywords
Robust hierarchical game; Reinforcement learning; Model-free; Off-policy;
DOI
10.1016/j.jfranklin.2024.106711
Chinese Library Classification (CLC)
TP [Automation and computer technology];
Discipline Code
0812 ;
Abstract
An off-policy model-free reinforcement learning (RL) algorithm is proposed for a robust hierarchical game with incomplete information and input constraints. The robust hierarchical game exhibits the characteristics of a Stackelberg-Nash (SN) game, whose equilibrium points are designated as Stackelberg-Nash-saddle equilibrium (SNE) points. An off-policy method is employed in the RL algorithm to address input constraints, using an excitation input, rather than the policies updated in real time, as the control input. Moreover, a model-free method is incorporated into the off-policy RL algorithm to account for the challenge posed by incomplete information. The goal of this paper is to develop an off-policy model-free RL algorithm that obtains approximate SNE policies of the robust hierarchical game under incomplete information and input constraints. Furthermore, the convergence and effectiveness of the off-policy model-free RL algorithm are guaranteed by proving the equivalence of the Bellman equations for the nominal SNE policies and the approximate SNE policies. Finally, a simulation is provided to verify the advantages of the developed algorithm.
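To make the abstract's two central ideas concrete, the following is a minimal, hypothetical sketch of off-policy model-free Q-learning for a single-agent discrete-time LQR: data are generated once by an exciting behavior input rather than by the policy being evaluated (off-policy), and the learner never touches the system matrices (model-free). All matrices, sizes, and the simple policy-iteration scheme are illustrative assumptions; this is not the paper's hierarchical-game algorithm.

```python
import numpy as np

# Hypothetical single-agent LQR example, not the authors' method.
rng = np.random.default_rng(0)
n, m = 2, 1
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # true dynamics, unknown to the learner
B = np.array([[0.0], [0.1]])             # true input matrix, unknown to the learner
Qc, Rc = np.eye(n), np.eye(m)            # stage cost x'Qc x + u'Rc u

def phi(x, u):
    """Quadratic features so that theta @ phi(x, u) = z' H z with z = [x; u]."""
    z = np.concatenate([x, u])
    iu = np.triu_indices(n + m)
    w = np.where(iu[0] == iu[1], 1.0, 2.0)   # double the off-diagonal terms
    return np.outer(z, z)[iu] * w

# One batch of off-policy data generated by a purely exciting input,
# reused for every policy update below (no real-time policy in the loop).
data, x = [], rng.standard_normal(n)
for _ in range(200):
    u = 0.5 * rng.standard_normal(m)
    cost = x @ Qc @ x + u @ Rc @ u
    x_next = A @ x + B @ u
    data.append((x, u, cost, x_next))
    x = x_next

K = np.zeros((m, n))                     # initial stabilizing gain
for _ in range(10):                      # policy iteration on the reused data
    # Bellman identity as a regression: Q(x,u) - Q(x', -Kx') = cost
    Phi = np.array([phi(x, u) - phi(xn, -K @ xn) for x, u, _, xn in data])
    y = np.array([c for _, _, c, _ in data])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    H = np.zeros((n + m, n + m))         # unpack theta into a symmetric H
    H[np.triu_indices(n + m)] = theta
    H = H + H.T - np.diag(np.diag(H))
    K = np.linalg.solve(H[n:, n:], H[n:, :n])   # greedy policy improvement

print("learned feedback gain K =", K)
```

For this deterministic sketch the least-squares step recovers the Q-function of the current gain exactly, so the iteration reproduces model-based policy iteration without ever using (A, B); the paper extends this kind of reasoning to hierarchical SN games with disturbances.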
Pages: 16
Related Papers
26 records in total
[1]  
Bravo L, 2020, J FRANKLIN I, V357, P5773
[2]   Hierarchical game for integrated energy system and electricity-hydrogen hybrid charging station under distributionally robust optimization [J].
Cai, Pengcheng ;
Mi, Yang ;
Ma, Siyuan ;
Li, Hongzhong ;
Li, Dongdong ;
Wang, Peng .
ENERGY, 2023, 283
[3]   Reinforcement learning based model-free optimized trajectory tracking strategy design for an AUV [J].
Duan, Kairong ;
Fong, Simon ;
Chen, C. L. Philip .
NEUROCOMPUTING, 2022, 469 :289-297
[4]   Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning [J].
Hashemzadeh, Maryam ;
Hosseini, Reshad ;
Ahmadabadi, Majid Nili .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (06) :1635-1650
[5]   A model-free distributed cooperative frequency control strategy for MT-HVDC systems using reinforcement learning method [J].
Hu, Zhong-Jie ;
Liu, Zhi-Wei ;
Li, Chaojie ;
Huang, Tingwen ;
Hu, Xiong .
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2021, 358 (13) :6490-6507
[6]   Reliable Distributed Computing for Metaverse: A Hierarchical Game-Theoretic Approach [J].
Jiang, Yuna ;
Kang, Jiawen ;
Niyato, Dusit ;
Ge, Xiaohu ;
Xiong, Zehui ;
Miao, Chunyan ;
Shen, Xuemin .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (01) :1084-1100
[7]   Discrete-Time Robust Hierarchical Linear-Quadratic Dynamic Games [J].
Kebriaei, Hamed ;
Iannelli, Luigi .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (03) :902-909
[8]   H∞ control of linear discrete-time systems: Off-policy reinforcement learning [J].
Kiumarsi, Bahare ;
Lewis, Frank L. ;
Jiang, Zhong-Ping .
AUTOMATICA, 2017, 78 :144-152
[9]   Multiplayer Stackelberg-Nash Game for Nonlinear System via Value Iteration-Based Integral Reinforcement Learning [J].
Li, Man ;
Qin, Jiahu ;
Freris, Nikolaos M. ;
Ho, Daniel W. C. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) :1429-1440
[10]   Neuro-adaptive control for searching generalized Nash equilibrium of multi-agent games: A two-stage design approach [J].
Meng, Qing ;
Nian, Xiaohong ;
Chen, Yong ;
Chen, Zhao .
NEUROCOMPUTING, 2023, 530 :69-80