Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games

Cited by: 25

Authors:

Zhao, Mingming [1,2]

Wang, Ding [1,2]

Ha, Mingming [3]

Qiao, Junfei [1,2]

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China

[2] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China

[3] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2023 / Vol. 53 / No. 07

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China;

Keywords:

Games; Game theory; Cost function; Mathematical models; Convergence; Iterative algorithms; Optimal control; Adaptive dynamic programming (ADP); evolving policy pair; incremental factor; incremental value iteration (VI); stability criterion; zero-sum games; DYNAMIC-PROGRAMMING ALGORITHM; STATE-FEEDBACK CONTROL; STABILITY ANALYSIS; POLICY ITERATIONS; ADAPTIVE-CONTROL; SYSTEMS;

DOI:

10.1109/TCYB.2022.3198078

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this article, evolving and incremental value iteration (VI) frameworks are constructed to address the discrete-time zero-sum game problem. First, in the evolving scheme, the closed-loop system is regulated by the evolving policy pair. For the control stage, a stability criterion is established to guarantee the availability of evolving policy pairs. Second, a novel incremental VI algorithm, which takes the historical information of the iterative process into account, is developed to solve the regulation and tracking problems for the nonlinear zero-sum game. By introducing different incremental factors, the convergence rate of the iterative cost function sequence can be adjusted. Finally, two simulation examples, covering linear and nonlinear systems, are conducted to demonstrate the performance and validity of the proposed evolving and incremental VI schemes.

Pages: 4487-4499

Page count: 13
