Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning

Cited by: 2
Authors
Ma, Xiao [1 ]
Yuan, Yuan [1 ]
Affiliation
[1] Northwestern Polytech Univ, Sch Astronaut, Xian 710072, Peoples R China
Source
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS | 2024, Vol. 361, Issue 7
Keywords
Robust hierarchical game; Reinforcement learning; Model-free; Off-policy;
DOI
10.1016/j.jfranklin.2024.106711
Chinese Library Classification (CLC)
TP [Automation and computer technology];
Discipline Code
0812 ;
Abstract
An off-policy model-free reinforcement learning (RL) algorithm is proposed for a robust hierarchical game with incomplete information and input constraints. The robust hierarchical game exhibits the characteristics of a Stackelberg-Nash (SN) game, whose equilibrium points are designated as Stackelberg-Nash-saddle equilibrium (SNE) points. An off-policy method is employed in the RL algorithm to address input constraints, using an excitation input, rather than the policies updated in real time, as the control input. Moreover, a model-free method is incorporated into the off-policy RL algorithm to account for the challenge posed by incomplete information. The goal of this paper is to develop an off-policy model-free RL algorithm that obtains approximate SNE policies of the robust hierarchical game under incomplete information and input constraints. Furthermore, the convergence and effectiveness of the off-policy model-free RL algorithm are guaranteed by proving the equivalence of the Bellman equations for the nominal SNE policies and the approximate SNE policies. Finally, a simulation is provided to verify the advantages of the developed algorithm.
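To make the abstract's two central ideas concrete, the following is a minimal, hypothetical sketch of off-policy model-free Q-learning for a single-agent discrete-time LQR: data are generated once by an exciting behavior input rather than by the policy being evaluated (off-policy), and the learner never touches the system matrices (model-free). All matrices, sizes, and the simple policy-iteration scheme are illustrative assumptions; this is not the paper's hierarchical-game algorithm.

```python
import numpy as np

# Hypothetical single-agent LQR example, not the authors' method.
rng = np.random.default_rng(0)
n, m = 2, 1
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # true dynamics, unknown to the learner
B = np.array([[0.0], [0.1]])             # true input matrix, unknown to the learner
Qc, Rc = np.eye(n), np.eye(m)            # stage cost x'Qc x + u'Rc u

def phi(x, u):
    """Quadratic features so that theta @ phi(x, u) = z' H z with z = [x; u]."""
    z = np.concatenate([x, u])
    iu = np.triu_indices(n + m)
    w = np.where(iu[0] == iu[1], 1.0, 2.0)   # double the off-diagonal terms
    return np.outer(z, z)[iu] * w

# One batch of off-policy data generated by a purely exciting input,
# reused for every policy update below (no real-time policy in the loop).
data, x = [], rng.standard_normal(n)
for _ in range(200):
    u = 0.5 * rng.standard_normal(m)
    cost = x @ Qc @ x + u @ Rc @ u
    x_next = A @ x + B @ u
    data.append((x, u, cost, x_next))
    x = x_next

K = np.zeros((m, n))                     # initial stabilizing gain
for _ in range(10):                      # policy iteration on the reused data
    # Bellman identity as a regression: Q(x,u) - Q(x', -Kx') = cost
    Phi = np.array([phi(x, u) - phi(xn, -K @ xn) for x, u, _, xn in data])
    y = np.array([c for _, _, c, _ in data])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    H = np.zeros((n + m, n + m))         # unpack theta into a symmetric H
    H[np.triu_indices(n + m)] = theta
    H = H + H.T - np.diag(np.diag(H))
    K = np.linalg.solve(H[n:, n:], H[n:, :n])   # greedy policy improvement

print("learned feedback gain K =", K)
```

For this deterministic sketch the least-squares step recovers the Q-function of the current gain exactly, so the iteration reproduces model-based policy iteration without ever using (A, B); the paper extends this kind of reasoning to hierarchical SN games with disturbances.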
Pages: 16
Related Papers
26 records in total
[1]  
Bravo L, 2020, J FRANKLIN I, V357, P5773
[2]   Hierarchical game for integrated energy system and electricity-hydrogen hybrid charging station under distributionally robust optimization [J].
Cai, Pengcheng ;
Mi, Yang ;
Ma, Siyuan ;
Li, Hongzhong ;
Li, Dongdong ;
Wang, Peng .
ENERGY, 2023, 283
[3]   Reinforcement learning based model-free optimized trajectory tracking strategy design for an AUV [J].
Duan, Kairong ;
Fong, Simon ;
Chen, C. L. Philip .
NEUROCOMPUTING, 2022, 469 :289-297
[4]   Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning [J].
Hashemzadeh, Maryam ;
Hosseini, Reshad ;
Ahmadabadi, Majid Nili .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (06) :1635-1650
[5]   A model-free distributed cooperative frequency control strategy for MT-HVDC systems using reinforcement learning method [J].
Hu, Zhong-Jie ;
Liu, Zhi-Wei ;
Li, Chaojie ;
Huang, Tingwen ;
Hu, Xiong .
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2021, 358 (13) :6490-6507
[6]   Reliable Distributed Computing for Metaverse: A Hierarchical Game-Theoretic Approach [J].
Jiang, Yuna ;
Kang, Jiawen ;
Niyato, Dusit ;
Ge, Xiaohu ;
Xiong, Zehui ;
Miao, Chunyan ;
Shen, Xuemin .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (01) :1084-1100
[7]   Discrete-Time Robust Hierarchical Linear-Quadratic Dynamic Games [J].
Kebriaei, Hamed ;
Iannelli, Luigi .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (03) :902-909
[8]   H∞ control of linear discrete-time systems: Off-policy reinforcement learning [J].
Kiumarsi, Bahare ;
Lewis, Frank L. ;
Jiang, Zhong-Ping .
AUTOMATICA, 2017, 78 :144-152
[9]   Multiplayer Stackelberg-Nash Game for Nonlinear System via Value Iteration-Based Integral Reinforcement Learning [J].
Li, Man ;
Qin, Jiahu ;
Freris, Nikolaos M. ;
Ho, Daniel W. C. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) :1429-1440
[10]   Neuro-adaptive control for searching generalized Nash equilibrium of multi-agent games: A two-stage design approach [J].
Meng, Qing ;
Nian, Xiaohong ;
Chen, Yong ;
Chen, Zhao .
NEUROCOMPUTING, 2023, 530 :69-80