Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis

Cited by: 33
Authors
Wei, Qinglai [1]
Liu, Derong [2]
Lin, Qiao [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; local iteration; neural networks; neurodynamic programming; nonlinear systems; optimal control; OPTIMAL TRACKING CONTROL; ZERO-SUM GAME; NONLINEAR-SYSTEMS; FEEDBACK-CONTROL; CONTROL SCHEME; LEARNING CONTROL; NETWORKS; DESIGN;
DOI
10.1109/TNNLS.2016.2593743
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. The paper focuses on the admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of over the whole state space. For the first time, admissibility properties of the iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.
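The local update described in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's algorithm: the toy system (integer states from -3 to 3, controls in {-1, 0, 1}, utility x^2 + u^2), the zero initial value function, the alternating choice of subsets, and the fixed number of iterations are all assumptions made for illustration, and the paper's admissibility analysis and termination criteria are not reproduced here. The names `local_value_iteration`, `step`, and `utility` are hypothetical.

```python
def local_value_iteration(states, actions, step, utility, subsets):
    """Tabular value iteration that updates only a subset of states per iteration.

    Assumption: the value function starts at zero everywhere; the paper may
    use a different initialization.
    """
    v = {x: 0.0 for x in states}
    policy = {x: None for x in states}
    for subset in subsets:
        new_v = dict(v)  # states outside the subset keep their old value
        for x in subset:
            # Greedy control with respect to the previous value function.
            best_u = min(actions, key=lambda u: utility(x, u) + v[step(x, u)])
            new_v[x] = utility(x, best_u) + v[step(x, best_u)]
            policy[x] = best_u
        v = new_v
    return v, policy


# Hypothetical toy system: x_{k+1} = clip(x_k + u_k, -3, 3).
states = list(range(-3, 4))
actions = (-1, 0, 1)


def step(x, u):
    return max(-3, min(3, x + u))


def utility(x, u):
    return float(x * x + u * u)


# Alternate between the non-positive and non-negative halves of the state space.
subsets = [[-3, -2, -1, 0], [0, 1, 2, 3]] * 10

v, policy = local_value_iteration(states, actions, step, utility, subsets)
print(v)
print(policy)
```

In this toy problem each half of the state space is updated ten times, which is enough for the values to stop changing; a practical implementation would replace the fixed iteration count with a stopping rule, such as the termination criteria the paper establishes.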
Pages: 2490 - 2502
Page count: 13
Related papers
50 in total
  • [21] Advanced value iteration for discrete-time intelligent critic control: A survey
    Zhao, Mingming
    Wang, Ding
    Qiao, Junfei
    Ha, Mingming
    Ren, Jin
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (10) : 12315 - 12346
  • [22] Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
    Qinglai Wei
    Ding Wang
    Dehua Zhang
    Neural Computing and Applications, 2013, 23 : 1851 - 1863
  • [23] Discrete-Time Generalized Policy Iteration ADP Algorithm With Approximation Errors
    Wei, Qinglai
    Li, Benkai
    Song, Ruizhuo
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 1636 - 1641
  • [24] A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
    Wei QingLai
    Liu DeRong
    SCIENCE CHINA-INFORMATION SCIENCES, 2015, 58 (12) : 1 - 15
  • [25] Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control
    Zhu, Yuanheng
    Zhao, Dongbin
    He, Haibo
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (11) : 3959 - 3971
  • [26] Convergence Analysis of Value Iteration Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems
    Xiao, Geyang
    Zhang, Huaguang
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (03) : 1639 - 1649
  • [27] Bias-Policy Iteration-Based Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
    Jiang, Huaiyuan
    Li, Xiang
    Zhou, Bin
    Cao, Xibin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024,
  • [28] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
    Guo, Wentao
    Si, Jennie
    Liu, Feng
    Mei, Shengwei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
  • [29] Spiking Adaptive Dynamic Programming Based on Poisson Process for Discrete-Time Nonlinear Systems
    Wei, Qinglai
    Han, Liyuan
    Zhang, Tielin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (05) : 1846 - 1856
  • [30] Discrete-Time Nonlinear Generalized Policy Iteration for Optimal Control Using Neural Networks
    Wei, Qinglai
    Liu, Derong
    Yang, Xiong
    NEURAL INFORMATION PROCESSING (ICONIP 2014), PT I, 2014, 8834 : 389 - 396