A novel stable value iteration-based approximate dynamic programming algorithm for discrete-time nonlinear systems

Cited by: 0
Authors
曲延华 [1]
王安娜 [1]
林盛 [1]
Affiliation
[1] College of Information Science and Engineering, Northeastern University
Keywords
adaptive dynamic programming (ADP); convergence; stability; discounted quadratic performance index
DOI
Not available
Chinese Library Classification number
O221 [Programming theory (mathematical programming)]
Discipline classification codes
070105; 1201
Abstract
The convergence and stability of a value-iteration-based adaptive dynamic programming (ADP) algorithm are considered for discrete-time nonlinear systems with a discounted quadratic performance index. Beyond achieving a good approximation structure, the iterative feedback control law must guarantee closed-loop stability. First, it is proved that the iterative value function sequence converges exactly to the optimum. Second, the necessary and sufficient condition for the optimal value function to serve as a Lyapunov function is investigated. We prove that, in the infinite-horizon case, there exists a finite horizon length for which the iterative feedback control law provides stability, which increases the practicability of the proposed value iteration algorithm. Neural networks (NNs) are employed to approximate the value functions and the optimal feedback control laws, and the approach allows the algorithm to be implemented without knowledge of the internal dynamics of the system. Finally, a simulation example demonstrates the effectiveness of the developed optimal control method.
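The two claims in the abstract (convergence of the iterative value functions, and stability after finitely many iterations) can be illustrated with a minimal sketch. It is not the paper's method: it assumes a scalar linear system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2 and discount factor gamma, so that each iterate has the closed form V_i(x) = p_i x^2 and no neural network is needed. The function names and the numerical values of `a`, `b`, `q`, `r`, and `gamma` are hypothetical, chosen only for illustration.

```python
def gain(p, a, b, r, gamma):
    """Feedback gain k of the law u = -k*x that minimizes
    r*u^2 + gamma*p*(a*x + b*u)^2 over u."""
    return gamma * a * b * p / (r + gamma * b * b * p)


def value_iteration(a, b, q, r, gamma, iters=200):
    """Return [p_0, p_1, ...] where V_i(x) = p_i * x^2 and V_0 = 0.

    Update: V_{i+1}(x) = min_u [q x^2 + r u^2 + gamma * V_i(a x + b u)],
    which for a quadratic V_i reduces to a scalar recursion in p_i.
    """
    ps = [0.0]
    for _ in range(iters):
        p = ps[-1]
        ps.append(q + gamma * a * a * p * r / (r + gamma * b * b * p))
    return ps


def first_stabilizing_iteration(ps, a, b, r, gamma):
    """Smallest i such that the control law derived from V_i gives
    |a - b*k_i| < 1, or None if no iterate stabilizes the system."""
    for i, p in enumerate(ps):
        if abs(a - b * gain(p, a, b, r, gamma)) < 1.0:
            return i
    return None


# Hypothetical open-loop unstable plant (|a| > 1), for illustration only.
A, B, Q, R, GAMMA = 1.2, 1.0, 1.0, 1.0, 0.95
P_SEQ = value_iteration(A, B, Q, R, GAMMA)
I_STABLE = first_stabilizing_iteration(P_SEQ, A, B, R, GAMMA)

if __name__ == "__main__":
    print("limit of p_i:", P_SEQ[-1])
    print("first stabilizing iteration:", I_STABLE)
```

In this toy setting the sequence p_i is nondecreasing from p_0 = 0 and settles at a fixed point of the update, and the control law becomes stabilizing after a finite number of iterations even though the initial law (u = 0) is not. The nonlinear case treated in the paper requires function approximation and the conditions stated in its proofs.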
Pages: 232-239
Page count: 8