Optimal Tracking Control for a Class of Nonlinear Discrete-Time Systems with Time Delays Based on Heuristic Dynamic Programming

Cited by: 160
Authors
Zhang, Huaguang [1 ,2 ]
Song, Ruizhuo [1 ]
Wei, Qinglai [3 ]
Zhang, Tieyan [4 ]
Affiliations
[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
[2] Northeastern Univ, Natl Educ Minist, Key Lab Integrated Automat Proc Ind, Shenyang 110004, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Key Lab Complex Syst & Intelligence Sci, Beijing 100190, Peoples R China
[4] Shenyang Inst Engn, Dept Elect Engn, Shenyang 110136, Peoples R China
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2011 / Vol. 22 / No. 12
Funding
National Natural Science Foundation of China;
Keywords
Adaptive dynamic programming; approximate dynamic programming; heuristic dynamic programming iteration; optimal control; time delays; ZERO-SUM GAMES; NEURAL-NETWORKS; MULTIPLE DELAYS; CONTROLLABILITY; DESIGNS;
DOI
10.1109/TNN.2011.2172628
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel heuristic dynamic programming (HDP) iteration algorithm to solve the optimal tracking control problem for a class of nonlinear discrete-time systems with time delays. The algorithm consists of three parts: state updating, control policy iteration, and performance index iteration. The states are updated alongside the control policy and the performance index so that the optimal states are obtained, and a "backward iteration" is applied to the state updating. To facilitate implementation of the HDP iteration algorithm, two neural networks are used to approximate the performance index function and to compute the optimal control policy. Finally, two examples demonstrate the effectiveness of the proposed algorithm.
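The interplay of control policy iteration and performance index iteration can be illustrated on a much simpler problem than the one the paper treats. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: it uses a hypothetical linear system with one delayed state, x(k+1) = a0·x(k) + a1·x(k-1) + b·u(k), regulates to the origin instead of tracking a reference, replaces the neural networks with an exact quadratic performance index V_i(z) = zᵀ P_i z, and handles the delay by state augmentation z(k) = [x(k), x(k-1)] rather than by the paper's state updating with backward iteration. All names and numeric values are illustrative.

```python
import numpy as np

def hdp_value_iteration(A, B, Q, R, iters=500, tol=1e-10):
    """Iterate V_{i+1}(z) = min_u [z'Qz + u'Ru + V_i(Az + Bu)] with V_0 = 0.

    For a linear system and quadratic cost, V_i(z) = z' P_i z, so each
    iteration reduces to a matrix update of P_i.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # V_0 = 0
    for _ in range(iters):
        # Control policy step: u = -K z minimizes the right-hand side for the current P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Performance index step: evaluate the minimized right-hand side.
        A_cl = A - B @ K
        P_next = Q + K.T @ R @ K + A_cl.T @ P @ A_cl
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Policy that is greedy with respect to the final performance index.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical delayed system (illustrative values, not from the paper).
a0, a1, b = 0.8, 0.3, 1.0
A = np.array([[a0, a1],
              [1.0, 0.0]])   # augmented state z(k) = [x(k), x(k-1)]
B = np.array([[b],
              [0.0]])
Q = np.diag([1.0, 0.0])      # penalize only the current state x(k)
R = np.array([[1.0]])

P, K = hdp_value_iteration(A, B, Q, R)
print("P =", P)
print("K =", K)
print("closed-loop spectral radius =", np.max(np.abs(np.linalg.eigvals(A - B @ K))))
```

State augmentation is chosen here only because it keeps the sketch short; for nonlinear dynamics the quadratic form no longer holds, which is where function approximators such as the neural networks mentioned in the abstract come in.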
Pages: 1851-1862
Page count: 12