An efficient model-free adaptive optimal control of continuous-time nonlinear non-zero-sum games based on integral reinforcement learning with exploration

Cited by: 0
Authors
Guo, Lei [1]
Xiong, Wenbo [1]
Song, Yuan [1]
Gan, Dongming [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Purdue Univ, Sch Engn Technol, W Lafayette, IN USA
Source
IET CONTROL THEORY AND APPLICATIONS | 2024 / Vol. 18 / Issue 06
Funding
National Natural Science Foundation of China;
Keywords
adaptive control; dynamic programming; game theory; optimal control; OPTIMAL TRACKING CONTROL; SYSTEMS;
DOI
10.1049/cth2.12610
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To reduce learning time and memory usage, this study presents a novel model-free algorithm for obtaining the Nash equilibrium solution of continuous-time nonlinear non-zero-sum games. Based on the integral reinforcement learning method, a new integral HJ equation is proposed that can quickly and cooperatively determine the Nash equilibrium strategies of all players. Using neural network (NN) approximation and the gradient descent method, simultaneous continuous-time adaptive tuning laws are provided for both the critic and actor NN weights. These laws allow the optimal value function and optimal policy to be estimated without requiring knowledge or identification of the system dynamics. Closed-loop system stability and convergence of the weights are guaranteed through Lyapunov analysis. Additionally, the algorithm is enhanced to reduce the number of auxiliary NNs used in the critic. Simulation results for a two-player non-zero-sum game validate the effectiveness of the proposed algorithm.
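For orientation, the problem class named in the abstract is commonly written as below. This is a sketch of the standard textbook formulation of an N-player continuous-time non-zero-sum game and of the generic integral reinforcement learning (IRL) Bellman relation, not the new integral HJ equation proposed in the paper. All symbols (f, g_j, u_j, Q_i, R_{ij}, the reinforcement interval T) are assumptions introduced here for illustration and do not appear in this record.

```latex
\begin{aligned}
% Input-affine dynamics driven by N players (assumed form)
\dot{x} &= f(x) + \sum_{j=1}^{N} g_j(x)\,u_j, \\
% Value function of player i under the policies u_1, ..., u_N
V_i\bigl(x(t)\bigr) &= \int_{t}^{\infty} \Bigl( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Bigr)\, d\tau, \\
% Generic IRL Bellman relation over an interval of length T:
% the drift f(x) does not appear explicitly
V_i\bigl(x(t)\bigr) &= \int_{t}^{t+T} \Bigl( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Bigr)\, d\tau + V_i\bigl(x(t+T)\bigr), \\
% Nash equilibrium policy of player i in terms of its optimal value function
u_i^{*}(x) &= -\tfrac{1}{2}\, R_{ii}^{-1}\, g_i^{\top}(x)\, \nabla V_i^{*}(x), \qquad i = 1, \dots, N.
\end{aligned}
```

In this generic form the policy expression still involves the input gains g_i(x); approximating the policy with a separate actor NN, as the abstract describes, is one way to avoid needing them.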
Pages: 748 - 763
Number of pages: 16
Related papers
50 in total
  • [41] Model-free PAC Time-Optimal Control Synthesis with Reinforcement Learning
    Liu, Mengyu
    Lu, Pengyuan
    Chen, Xin
    Sokolsky, Oleg
    Lee, Insup
    Kong, Fanxin
    2024 22ND ACM-IEEE INTERNATIONAL SYMPOSIUM ON FORMAL METHODS AND MODELS FOR SYSTEM DESIGN, MEMOCODE 2024, 2024, : 34 - 45
  • [42] Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game
    Wei, Qinglai
    Zhu, Liao
    Song, Ruizhuo
    Zhang, Pinjia
    Liu, Derong
    Xiao, Jun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (02) : 879 - 892
  • [43] Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
    Vamvoudakis, Kyriakos G.
    SYSTEMS & CONTROL LETTERS, 2017, 100 : 14 - 20
  • [44] Optimal Tracking Control of Partial Unknown Continuous-Time Systems Using Integral Reinforcement Learning
    Cheng, Weiran
    Xiao, Zhenfei
    Li, Jinna
    2020 35TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2020, : 308 - 311
  • [45] Model-Free Reinforcement Learning by Embedding an Auxiliary System for Optimal Control of Nonlinear Systems
    Xu, Zhenhui
    Shen, Tielong
    Cheng, Daizhan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1520 - 1534
  • [46] Sampled-data model-free adaptive integral sliding mode control for nonlinear continuous-time networked control systems with fading channels and packet dropouts
    Chang, Lina
    Hou, Zhongsheng
    NEUROCOMPUTING, 2024, 589
  • [47] A Single-NN Iterative Adaptive Dynamic Programming Algorithm for Continuous-Time Nonlinear Zero-Sum Games
    Song, Ruizhuo
    Li, Junsong
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 2848 - 2853
  • [48] Model-free finite-horizon optimal control of discrete-time two-player zero-sum games
    Wang, Wei
    Chen, Xin
    Du, Jianhua
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2023, 54 (01) : 167 - 179
  • [49] Model-free learning adaptive control for nonlinear systems with multiple time delay
    Hu, Z.Q.
    Li, X.D.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2001, 33 (02) : 261 - 264
  • [50] Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems
    Yasini, Sholeh
    Karimpour, Ali
    Sistani, Mohammad-Bagher Naghibi
    Modares, Hamidreza
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2015, 29 (04) : 473 - 493