Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games

Cited by: 3
Authors
Zhou, Peixin [1]
Xue, Huiwen [1]
Wen, Jiwei [1]
Shi, Peng [2, 3]
Luan, Xiaoli [1]
Affiliations
[1] Jiangnan Univ, Sch Internet Things Engn, Key Lab Adv Proc Control Light Ind, Minist Educ, Wuxi 214122, Peoples R China
[2] Univ Adelaide, Sch Elect & Mech Engn, Adelaide, SA 5005, Australia
[3] Obuda Univ, Res & Innovat Ctr, H-1034 Budapest, Hungary
Funding
National Natural Science Foundation of China
Keywords
Value iteration algorithm; Influence function; Adaptive optimal tracking; Non-zero-sum game; Nash equilibrium
DOI
10.1016/j.ins.2023.119423
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games (NZSGs). First, coupled action and mode-dependent value functions (CAMDVFs) are constructed to solve a two-player NZSG and obtain Nash equilibrium solutions. Second, a value iteration (VI) algorithm is proposed that updates the policies for all modes in parallel, using data collected from the different operation modes within each iterative window. Moreover, the monotonically increasing convergence of the iterated CAMDVFs is proved by introducing auxiliary functions between two adjacent iterations. Notably, an influence function is introduced to remove abnormal data, which improves the learning capability of the VI algorithm. Finally, the validity, self-adaptability and application potential of the tracking policies are verified on a numerical example and a generalized economic model.
Pages: 17