Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games

Cited by: 25
Authors
Zhao, Mingming [1, 2]
Wang, Ding [1, 2]
Ha, Mingming [3]
Qiao, Junfei [1, 2]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Games; Game theory; Cost function; Mathematical models; Convergence; Iterative algorithms; Optimal control; Adaptive dynamic programming (ADP); evolving policy pair; incremental factor; incremental value iteration (VI); stability criterion; zero-sum games; DYNAMIC-PROGRAMMING ALGORITHM; STATE-FEEDBACK CONTROL; STABILITY ANALYSIS; POLICY ITERATIONS; ADAPTIVE-CONTROL; SYSTEMS;
DOI
10.1109/TCYB.2022.3198078
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This article constructs evolving and incremental value iteration (VI) frameworks to address the discrete-time zero-sum game problem. First, in the evolving scheme, the closed-loop system is regulated by an evolving policy pair. For the control stage, a stability criterion is established to guarantee that the evolving policy pairs can be applied. Second, a novel incremental VI algorithm, which takes the historical information of the iterative process into account, is developed to solve the regulation and tracking problems for nonlinear zero-sum games. Introducing different incremental factors makes it possible to adjust the convergence rate of the iterative cost function sequence. Finally, two simulation examples, covering linear and nonlinear systems, demonstrate the performance and validity of the proposed evolving and incremental VI schemes.
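The abstract does not state the incremental update rule, so the following is only a minimal sketch of the general idea, under stated assumptions: a scalar linear zero-sum game x(k+1) = a*x + b*u + d*w with stage cost q*x^2 + r*u^2 - gamma^2*w^2, a quadratic cost function V_i(x) = p_i*x^2, and a relaxation-style update p_{i+1} = p_i + omega*(T(p_i) - p_i), where T is one step of standard game VI and omega plays the role of an incremental factor. The system values, the names `SYSTEM`, `game_vi_step`, and `incremental_vi`, and the relaxation form itself are hypothetical illustrations, not the authors' algorithm.

```python
# Hypothetical scalar zero-sum game: x(k+1) = a*x + b*u + d*w,
# stage cost q*x^2 + r*u^2 - gamma^2*w^2 (u minimizes, w maximizes).
# All numbers below are assumptions chosen for illustration.
SYSTEM = dict(a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0)


def game_vi_step(p, a, b, d, q, r, gamma):
    """One standard VI step for V(x) = p*x^2 (scalar game Riccati recursion)."""
    s = b**2 / r - d**2 / gamma**2
    return q + a**2 * p / (1.0 + s * p)


def incremental_vi(omega, p0=0.0, tol=1e-10, max_iter=10_000, **system):
    """Relaxed VI: p <- p + omega * (T(p) - p). omega = 1 is standard VI.

    Returns the converged coefficient p and the number of iterations used.
    """
    p = p0
    for i in range(1, max_iter + 1):
        p_next = p + omega * (game_vi_step(p, **system) - p)
        if abs(p_next - p) < tol:
            return p_next, i
        p = p_next
    raise RuntimeError("incremental VI did not converge")


# Compare how the assumed incremental factor changes the iteration count.
results = {omega: incremental_vi(omega, **SYSTEM) for omega in (0.5, 1.0, 1.1)}
for omega, (p, iters) in results.items():
    print(f"omega={omega}: p={p:.6f}, iterations={iters}")
```

In this sketch every factor reaches the same fixed point of the game Riccati recursion, and only the number of iterations changes, which mirrors the abstract's claim that the incremental factor adjusts the convergence rate of the iterative cost function sequence.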
Pages: 4487-4499
Page count: 13