An efficient model-free adaptive optimal control of continuous-time nonlinear non-zero-sum games based on integral reinforcement learning with exploration

Authors
Guo, Lei [1]
Xiong, Wenbo [1]
Song, Yuan [1]
Gan, Dongming [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Purdue Univ, Sch Engn Technol, W Lafayette, IN USA
Source
IET CONTROL THEORY AND APPLICATIONS | 2024 / Vol. 18 / No. 06
Funding
National Natural Science Foundation of China;
Keywords
adaptive control; dynamic programming; game theory; optimal control; OPTIMAL TRACKING CONTROL; SYSTEMS;
DOI
10.1049/cth2.12610
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To reduce the learning time and space occupation, this study presents a novel model-free algorithm for obtaining the Nash equilibrium solution of continuous-time nonlinear non-zero-sum games. Based on the integral reinforcement learning method, a new integral HJ equation that can quickly and cooperatively determine the Nash equilibrium strategies of all players is proposed. By leveraging neural network (NN) approximation and the gradient descent method, simultaneous continuous-time adaptive tuning laws are provided for both critic and actor NN weights. These laws facilitate the estimation of the optimal value function and optimal policy without requiring knowledge or identification of the system's dynamics. The closed-loop system stability and convergence of weights are guaranteed through Lyapunov analysis. Additionally, the algorithm is enhanced to reduce the number of auxiliary NNs used in the critic. The simulation results for a two-player non-zero-sum game validate the effectiveness of the proposed algorithm.
Pages: 748-763
Number of pages: 16
Related papers
50 in total
  • [1] Model-free adaptive optimal control of continuous-time nonlinear non-zero-sum games based on reinforcement learning
    Guo, Lei
    Zhao, Han
    IET CONTROL THEORY AND APPLICATIONS, 2023, 17 (02): : 223 - 239
  • [2] Data-Driven Integral Reinforcement Learning for Continuous-Time Non-Zero-Sum Games
    Yang, Yongliang
    Wang, Liming
    Modares, Hamidreza
    Ding, Dawei
    Yin, Yixin
    Wunsch, Donald
    IEEE ACCESS, 2019, 7 : 82901 - 82912
  • [3] Model-Free Temporal Difference Learning for Non-Zero-Sum Games
    Wang, Liming
    Yang, Yongliang
    Ding, Dawei
    Yin, Yixin
    Guo, Zhishan
    Wunsch, Donald C.
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [4] Robust Tracking Control for Non-Zero-Sum Games of Continuous-Time Uncertain Nonlinear Systems
    Qin, Chunbin
    Shang, Ziyang
    Zhang, Zhongwei
    Zhang, Dehua
    Zhang, Jishi
    MATHEMATICS, 2022, 10 (11)
  • [5] Model-Free Adaptive Algorithm for Optimal Control of Continuous-Time Nonlinear System
    Zhu, Yuanheng
    Zhao, Dongbin
    2015 CHINESE AUTOMATION CONGRESS (CAC), 2015, : 1850 - 1855
  • [6] Integral reinforcement learning-based online adaptive event-triggered control for non-zero-sum games of partially unknown nonlinear systems
    Su, Hanguang
    Zhang, Huaguang
    Sun, Shaoxin
    Cai, Yuliang
    NEUROCOMPUTING, 2020, 377 : 243 - 255
  • [7] Model-Free Reinforcement Learning for Nonlinear Zero-Sum Games with Simultaneous Explorations
    Zhang, Qichao
    Zhao, Dongbin
    Zhu, Yuanheng
    Chen, Xi
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 4533 - 4538
  • [8] Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games
    Zhou, Peixin
    Xue, Huiwen
    Wen, Jiwei
    Shi, Peng
    Luan, Xiaoli
    INFORMATION SCIENCES, 2023, 647
  • [9] Integral Reinforcement Learning for Linear Continuous-Time Zero-Sum Games With Completely Unknown Dynamics
    Li, Hongliang
    Liu, Derong
    Wang, Ding
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2014, 11 (03) : 706 - 714
  • [10] Efficient Exploration in Continuous-time Model-based Reinforcement Learning
    Treven, Lenart
    Hubotter, Jonas
    Sukhija, Bhavya
    Dorfler, Florian
    Krause, Andreas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,