Dichotomy value iteration with parallel learning design towards discrete-time zero-sum games

Cited by: 3

Authors:

Wang, Jiangyu ^{[1,2,3,4]}

Wang, Ding ^{[1]}

Li, Xin ^{[1,2,3,4]}

Qiao, Junfei ^{[1,2,3,4]}

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

[2] Beijing Univ Technol, Key Lab Computat Intelligence & Intelligent Syst, Beijing 100124, Peoples R China

[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China

[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China

Source:

NEURAL NETWORKS | 2023 / Vol. 167

Funding:

National Natural Science Foundation of China;

Keywords:

Adaptive critic; Artificial neural networks; Nonlinear systems; Parallel learning; Value iteration; Zero-sum games; ADAPTIVE CRITIC DESIGNS; STABILITY ANALYSIS;

DOI:

10.1016/j.neunet.2023.09.009

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a novel parallel learning framework is developed to solve zero-sum games for discrete-time nonlinear systems. Briefly, the purpose of this study is to determine a tentative function according to the prior knowledge of the value iteration (VI) algorithm. The learning process of the parallel controllers can be guided by the tentative function. That is to say, the neighborhood of the optimal cost function can be compressed within a small range via two typical exploration policies. Based on the parallel learning framework, a novel dichotomy VI algorithm is established to accelerate the learning speed. It is shown that the parallel controllers will converge to the optimal policy from contrary initial policies. Finally, two typical systems are used to demonstrate the learning performance of the constructed dichotomy VI algorithm. (c) 2023 Elsevier Ltd. All rights reserved.
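
As a rough illustration of how value iteration can approach an optimal cost function from two opposite sides, the sketch below runs plain VI on a hypothetical scalar linear-quadratic zero-sum game. It is not the dichotomy VI algorithm of the paper: the system x_{k+1} = a x_k + b u_k + c w_k, the utility q x^2 + r u^2 - gamma^2 w^2, the quadratic cost V(x) = p x^2, and all numeric values are assumptions chosen only for this example.

```python
# Plain value iteration on a hypothetical scalar zero-sum game.
# Assumed (not from the paper): x_{k+1} = a*x + b*u + c*w,
# utility q*x^2 + r*u^2 - gamma^2*w^2, cost function V(x) = p*x^2.

def vi_step(p, a=0.9, b=1.0, c=0.5, q=1.0, r=1.0, gamma=2.0):
    """One VI update of the cost coefficient p (scalar game Riccati map).

    Valid while gamma^2 - c^2 * p > 0, which holds for every p used here.
    """
    s = b * b / r - c * c / (gamma * gamma)
    return q + a * a * p / (1.0 + s * p)


def run_vi(p0, iters=50):
    """Iterate vi_step from the initial coefficient p0 and return all iterates."""
    seq = [p0]
    for _ in range(iters):
        seq.append(vi_step(seq[-1]))
    return seq


# Two contrary initializations: one below and one above the fixed point.
lower = run_vi(0.0)    # starts from the zero cost function
upper = run_vi(10.0)   # starts from a deliberately large cost function

if __name__ == "__main__":
    for i in (0, 1, 2, 5, 50):
        print(i, lower[i], upper[i])
```

In this toy setting the update map is increasing in p, so the sequence started at zero rises toward the fixed point while the one started at 10 falls toward it, and the two together bracket the optimal coefficient at every iteration.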


Pages: 751-762

Number of pages: 12

50 items in total

[1] Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games
Zhao, Mingming
Wang, Ding
Ha, Mingming
Qiao, Junfei
IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (07) : 4487 - 4499
[2] Adaptive Dynamic Programming for Discrete-Time Zero-Sum Games
Wei, Qinglai
Liu, Derong
Lin, Qiao
Song, Ruizhuo
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (04) : 957 - 969
[3] Discrete-time zero-sum Markov games with first passage criteria
Liu, Qiuli
Huang, Xiangxiang
OPTIMIZATION, 2017, 66 (04) : 571 - 587
[4] EXISTENCE OF VALUE AND RANDOMIZED STRATEGIES IN ZERO-SUM DISCRETE-TIME STOCHASTIC DYNAMIC-GAMES
KUMAR, PR
SHIAU, TH
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1981, 19 (05) : 617 - 634
[5] Heuristic Search Value Iteration for Zero-Sum Stochastic Games
Buffet, Olivier
Dibangoye, Jilles
Saffidine, Abdallah
Thomas, Vincent
IEEE TRANSACTIONS ON GAMES, 2021, 13 (03) : 239 - 248
[6] Event-Triggered Adaptive Control for Discrete-Time Zero-Sum Games
Wang, Ziyang
Wei, Qinglai
Liu, Derong
Luo, Yanhong
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
[7] Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate
Wang, Yuan
Wang, Ding
Zhao, Mingming
Liu, Nan
Qiao, Junfei
NEURAL NETWORKS, 2024, 175
[8] Accelerated Value Iteration for Nonlinear Zero-Sum Games with Convergence Guarantee
Wang, Yuan
Zhao, Mingming
Liu, Nan
Wang, Ding
GUIDANCE NAVIGATION AND CONTROL, 2024, (01) : 125 - 152
[9] Accelerated Value Iteration for Nonlinear Zero-Sum Games with Convergence Guarantee
Wang, Yuan
Zhao, Mingming
Liu, Nan
Wang, Ding
GUIDANCE NAVIGATION AND CONTROL, 2024, 04 (01)
[10] Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
Al-Tamimi, Asma
Abu-Khalaf, Murad
Lewis, Frank L.
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2007, 37 (01) : 240 - 247