Dichotomy value iteration with parallel learning design towards discrete-time zero-sum games

Cited by: 3
Authors
Wang, Jiangyu [1 ,2 ,3 ,4 ]
Wang, Ding [1 ]
Li, Xin [1 ,2 ,3 ,4 ]
Qiao, Junfei [1 ,2 ,3 ,4 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Key Lab Computat Intelligence & Intelligent Syst, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive critic; Artificial neural networks; Nonlinear systems; Parallel learning; Value iteration; Zero-sum games; ADAPTIVE CRITIC DESIGNS; STABILITY ANALYSIS;
DOI
10.1016/j.neunet.2023.09.009
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel parallel learning framework is developed to solve zero-sum games for discrete-time nonlinear systems. Briefly, the purpose of this study is to determine a tentative function according to the prior knowledge of the value iteration (VI) algorithm. The learning process of the parallel controllers can be guided by the tentative function. That is to say, the neighborhood of the optimal cost function can be compressed within a small range via two typical exploration policies. Based on the parallel learning framework, a novel dichotomy VI algorithm is established to accelerate the learning speed. It is shown that the parallel controllers will converge to the optimal policy from contrary initial policies. Finally, two typical systems are used to demonstrate the learning performance of the constructed dichotomy VI algorithm. (c) 2023 Elsevier Ltd. All rights reserved.
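The abstract does not state the update rule, so the following is only a minimal illustrative sketch, not the authors' dichotomy VI algorithm. It assumes a hypothetical scalar linear-quadratic zero-sum game, x_{k+1} = a x_k + b u_k + d w_k with stage cost q x^2 + r u^2 - gamma2 w^2 and value function V(x) = p x^2, for which value iteration reduces to a scalar game Riccati recursion. All parameter values and function names are assumptions chosen for illustration. Two runs started from contrary initial values (one below and one above the fixed point) bracket the optimal p, and a plain bisection on the sign of the Bellman residual then narrows that bracket.

```python
def bellman_update(p, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma2=4.0):
    """One value-iteration step for V(x) = p * x**2.

    Scalar game Riccati recursion; valid while gamma2 - d*d*p > 0
    (here p < 16), which holds for every iterate used below.
    All parameters are illustrative assumptions, not from the paper.
    """
    s = b * b / r - d * d / gamma2
    return q + a * a * p / (1.0 + s * p)


def value_iteration(p0, steps):
    """Return the iterates [p0, p1, ..., p_steps] of the recursion."""
    seq = [p0]
    for _ in range(steps):
        seq.append(bellman_update(seq[-1]))
    return seq


def bisect_fixed_point(lo, hi, tol=1e-10):
    """Bisection on the Bellman residual f(p) - p.

    Assumes lo is below and hi is above the fixed point, so the
    residual is positive at lo and negative at hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bellman_update(mid) > mid:
            lo = mid   # mid is still below the fixed point
        else:
            hi = mid   # mid is at or above the fixed point
    return 0.5 * (lo + hi)


# Two runs from contrary initial values bracket the fixed point.
lower = value_iteration(0.0, 60)    # increases toward the fixed point
upper = value_iteration(10.0, 60)   # decreases toward the fixed point

# Use a few early iterates as a bracket, then halve it repeatedly.
p_star = bisect_fixed_point(lower[5], upper[5])
```

In this toy setting the update map is increasing in p, which is why the two sequences approach the fixed point monotonically from opposite sides; the nonlinear, neural-network-based setting of the paper would need its own analysis.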
Pages: 751-762
Number of pages: 12
Related papers
50 in total
  • [1] Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games
    Zhao, Mingming
    Wang, Ding
    Ha, Mingming
    Qiao, Junfei
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (07) : 4487 - 4499
  • [2] Adaptive Dynamic Programming for Discrete-Time Zero-Sum Games
    Wei, Qinglai
    Liu, Derong
    Lin, Qiao
    Song, Ruizhuo
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (04) : 957 - 969
  • [3] Discrete-time zero-sum Markov games with first passage criteria
    Liu, Qiuli
    Huang, Xiangxiang
    OPTIMIZATION, 2017, 66 (04) : 571 - 587
  • [4] EXISTENCE OF VALUE AND RANDOMIZED STRATEGIES IN ZERO-SUM DISCRETE-TIME STOCHASTIC DYNAMIC-GAMES
    KUMAR, PR
    SHIAU, TH
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1981, 19 (05) : 617 - 634
  • [5] Heuristic Search Value Iteration for Zero-Sum Stochastic Games
    Buffet, Olivier
    Dibangoye, Jilles
    Saffidine, Abdallah
    Thomas, Vincent
    IEEE TRANSACTIONS ON GAMES, 2021, 13 (03) : 239 - 248
  • [6] Event-Triggered Adaptive Control for Discrete-Time Zero-Sum Games
    Wang, Ziyang
    Wei, Qinglai
    Liu, Derong
    Luo, Yanhong
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [7] Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate
    Wang, Yuan
    Wang, Ding
    Zhao, Mingming
    Liu, Nan
    Qiao, Junfei
    NEURAL NETWORKS, 2024, 175
  • [8] Accelerated Value Iteration for Nonlinear Zero-Sum Games with Convergence Guarantee
    Yuan Wang
    Mingming Zhao
    Nan Liu
    Ding Wang
    Guidance, Navigation and Control, 2024, (01) : 125 - 152
  • [9] Accelerated Value Iteration for Nonlinear Zero-Sum Games with Convergence Guarantee
    Wang, Yuan
    Zhao, Mingming
    Liu, Nan
    Wang, Ding
    GUIDANCE NAVIGATION AND CONTROL, 2024, 04 (01)
  • [10] Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
    Al-Tamimi, Asma
    Abu-Khalaf, Murad
    Lewis, Frank L.
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2007, 37 (01) : 240 - 247