Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games

Cited by: 3
Authors
Zhou, Peixin [1]
Xue, Huiwen [1]
Wen, Jiwei [1]
Shi, Peng [2, 3]
Luan, Xiaoli [1]
Affiliations
[1] Jiangnan Univ, Sch Internet Things Engn, Key Lab Adv Proc Control Light Ind, Minist Educ, Wuxi 214122, Peoples R China
[2] Univ Adelaide, Sch Elect & Mech Engn, Adelaide, SA 5005, Australia
[3] Obuda Univ, Res & Innovat Ctr, H-1034 Budapest, Hungary
Funding
National Natural Science Foundation of China
Keywords
Value iteration algorithm; Influence function; Adaptive optimal tracking; Non-zero-sum game; Nash equilibrium
DOI
10.1016/j.ins.2023.119423
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games (NZSGs). First, coupled action and mode-dependent value functions (CAMDVFs) are constructed to solve a two-player NZSG and obtain Nash equilibrium solutions. Second, a value iteration (VI) algorithm is proposed that updates the policies under each mode in parallel, using data collected from the different operation modes within each iterative window. Moreover, the convergence of the iteratively increasing CAMDVFs is proved by introducing auxiliary functions between two adjacent iterations. Notably, an influence function is introduced to remove abnormal data, which effectively improves the learning capability of the VI algorithm. Finally, the validity, self-adaptability, and application potential of the tracking policies are verified on a numerical example and a generalized economic model.
Pages: 17
Related papers
50 in total
  • [21] Model-Free Learning of Optimal Ergodic Policies in Wireless Systems
    Kalogerias, Dionysios S.
    Eisen, Mark
    Pappas, George J.
    Ribeiro, Alejandro
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2020, 68 : 6272 - 6286
  • [22] Model-Free Frequency Control of Power Systems With Unknown Markov Jump Parameters
    Huo, Shicheng
    Wang, Zhipeng
    Liu, Guobao
    Li, Feng
    Shen, Hao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (12) : 4934 - 4938
  • [24] Zero-sum Markov games and worst-case optimal control of queueing systems
    Altman, E
    Hordijk, A
    QUEUEING SYSTEMS, 1995, 21 (3-4) : 415 - 447
  • [25] A Sufficient Condition for Linear-Quadratic Stochastic Zero-Sum Differential Games for Markov Jump Systems
    Moon, Jun
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (04) : 1619 - 1626
  • [26] Stochastic Zero-Sum Differential Games and H∞ Control of Discrete-time Markov Jump Systems
    Zhou Haiying
    Zhu Huainian
    Zhang Chengke
    26TH CHINESE CONTROL AND DECISION CONFERENCE (2014 CCDC), 2014, : 151 - 156
  • [27] Optimal control and non-zero-sum differential game for Hurwicz model considering uncertain dynamic systems with multiple input delays
    Li, Xi
    Song, Qiankun
    Liu, Yurong
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2023, 54 (08) : 1676 - 1693
  • [28] Model-free finite-horizon optimal control of discrete-time two-player zero-sum games
    Wang, Wei
    Chen, Xin
    Du, Jianhua
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2023, 54 (01) : 167 - 179
  • [29] On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games
    Perolat, Julien
    Piot, Bilal
    Scherrer, Bruno
    Pietquin, Olivier
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 893 - 901
  • [30] Online event-triggered adaptive critic design for non-zero-sum games of partially unknown networked systems
    Su, Hanguang
    Zhang, Huaguang
    Liang, Yuling
    Mu, Yunfei
    NEUROCOMPUTING, 2019, 368 : 84 - 98