Dichotomy value iteration with parallel learning design towards discrete-time zero-sum games

Cited by: 3
Authors
Wang, Jiangyu [1, 2, 3, 4]
Wang, Ding [1]
Li, Xin [1, 2, 3, 4]
Qiao, Junfei [1, 2, 3, 4]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Key Lab Computat Intelligence & Intelligent Syst, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[4] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive critic; Artificial neural networks; Nonlinear systems; Parallel learning; Value iteration; Zero-sum games; ADAPTIVE CRITIC DESIGNS; STABILITY ANALYSIS
DOI
10.1016/j.neunet.2023.09.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a novel parallel learning framework is developed to solve zero-sum games for discrete-time nonlinear systems. Briefly, the purpose of this study is to determine a tentative function according to the prior knowledge of the value iteration (VI) algorithm. The learning process of the parallel controllers can be guided by the tentative function. That is to say, the neighborhood of the optimal cost function can be compressed within a small range via two typical exploration policies. Based on the parallel learning framework, a novel dichotomy VI algorithm is established to accelerate the learning speed. It is shown that the parallel controllers will converge to the optimal policy from contrary initial policies. Finally, two typical systems are used to demonstrate the learning performance of the constructed dichotomy VI algorithm. © 2023 Elsevier Ltd. All rights reserved.
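The abstract's remark that learning can approach the optimal cost function from contrary initial policies can be illustrated with ordinary value iteration on a toy problem. The sketch below is not the paper's dichotomy VI algorithm (the abstract does not specify the tentative function or the update rule). It assumes a hypothetical scalar linear system x_{k+1} = a x_k + b u_k + d w_k with stage cost q x^2 + r u^2 - gamma^2 w^2, for which the value function is V(x) = p x^2 and standard VI reduces to a scalar game Riccati recursion. All parameter values and function names are illustrative assumptions.

```python
# Plain value iteration for a scalar linear-quadratic zero-sum game.
# Assumption: this is a textbook recursion used only for illustration,
# not the dichotomy VI algorithm described in the abstract.


def game_riccati_step(p, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0):
    """One VI update of the kernel p in V(x) = p * x**2.

    The control u minimizes and the disturbance w maximizes
    q*x**2 + r*u**2 - gamma**2 * w**2 + p * (a*x + b*u + d*w)**2.
    """
    # The inner maximization over w is well posed only if this holds.
    if gamma**2 - d**2 * p <= 0:
        raise ValueError("no saddle point: gamma**2 - d**2 * p must be positive")
    s = b**2 / r - d**2 / gamma**2
    return q + a**2 * p / (1.0 + s * p)


def iterate(p0, steps=200):
    """Return the sequence of kernels p_0, p_1, ..., p_steps."""
    seq = [p0]
    for _ in range(steps):
        seq.append(game_riccati_step(seq[-1]))
    return seq


# Two "contrary" initializations (hypothetical values):
# p0 = 0 starts below the fixed point, p0 = 10 starts above it.
lower = iterate(0.0)
upper = iterate(10.0)

if __name__ == "__main__":
    for i in (0, 1, 2, 5, 10, 50):
        print(f"iter {i:3d}: lower = {lower[i]:.6f}, upper = {upper[i]:.6f}")
```

In this toy setting the sequence started from zero increases and the one started from a large value decreases, so the two bracket the optimal kernel and the gap between them shrinks at each iteration. How the paper uses such bracketing to build its tentative function is not stated in the abstract.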
Pages: 751-762
Page count: 12
Related papers
50 in total
  • [21] On-Policy and Off-Policy Value Iteration Algorithms for Stochastic Zero-Sum Dynamic Games
    Guo, Liangyuan
    Wang, Bing-Chang
    Zhang, Ji-Feng
    JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2025, 38 (01) : 421 - 435
  • [22] Novel single-loop policy iteration for linear zero-sum games
    Zhao, Jianguo
    Yang, Chunyu
    Gao, Weinan
    Park, Ju H.
    AUTOMATICA, 2024, 163
  • [23] Discrete-Time Non-Zero-Sum Games With Completely Unknown Dynamics
    Song, Ruizhuo
    Wei, Qinglai
    Zhang, Huaguang
    Lewis, Frank L.
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (06) : 2929 - 2943
  • [24] Uniform continuity of the value of zero-sum games with differential information
    Einy, Ezra
    Haimanko, Ori
    Moreno, Diego
    Shitovitz, Benyamin
    MATHEMATICS OF OPERATIONS RESEARCH, 2008, 33 (03) : 552 - 560
  • [25] H∞ Consensus for Discrete-Time Fractional-Order Multi-Agent Systems With Disturbance via Q-Learning in Zero-Sum Games
    An, Chunlan
    Su, Housheng
    Chen, Shiming
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2022, 9 (04) : 2803 - 2814
  • [26] EXISTENCE OF THE UNIFORM VALUE IN ZERO-SUM REPEATED GAMES WITH A MORE INFORMED CONTROLLER
    Gensbittel, Fabien
    Oliu-Barton, Miquel
    Venel, Xavier
    JOURNAL OF DYNAMICS AND GAMES, 2014, 1 (03) : 411 - 445
  • [27] Advanced value iteration for discrete-time intelligent critic control: A survey
    Zhao, Mingming
    Wang, Ding
    Qiao, Junfei
    Ha, Mingming
    Ren, Jin
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (10) : 12315 - 12346
  • [28] Asymmetric Constrained Optimal Tracking Control With Critic Learning of Nonlinear Multiplayer Zero-Sum Games
    Qiao, Junfei
    Li, Menghua
    Wang, Ding
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 5671 - 5683
  • [29] Event-triggered Control Design for Optimal Tracking of Unknown Nonlinear Zero-sum Games
    Wang D.
    Hu L.-Z.
    Zhao M.-M.
    Ha M.-M.
    Qiao J.-F.
    Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (01) : 91 - 101
  • [30] Event-triggered optimal control for discrete-time multi-player non-zero-sum games using parallel control
    Lu, Jingwei
    Wei, Qinglai
    Wang, Ziyang
    Zhou, Tianmin
    Wang, Fei-Yue
    INFORMATION SCIENCES, 2022, 584 : 519 - 535