Near-optimal stabilization for a class of nonlinear systems with control constraint based on single network greedy iterative DHP algorithm

Cited by: 5

Authors:

Luo, Yan-Hong [1,2]

Zhang, Hua-Guang [1,2]

Cao, Ning [2]

Chen, Bing [3]

Affiliations:

[1] Key Laboratory of Integrated Automation for the Process Industry, Northeastern University

[2] School of Information Science and Engineering, Northeastern University

[3] Institute of Complexity Science, Qingdao University

Source:

Zidonghua Xuebao / Acta Automatica Sinica | 2009, Vol. 35, No. 11

Keywords:

Constraint; Greedy iterative; Neural network; Nonquadratic functional; Optimal control

DOI:

10.3724/SP.J.1004.2009.01436

Abstract:

The near-optimal stabilization problem for nonlinear systems with control constraints is solved by a greedy iterative DHP (dual heuristic programming) algorithm. To account for the control constraint, a nonquadratic functional is first introduced to transform the constrained problem into an unconstrained one. Then, based on the costate function, the greedy iterative DHP algorithm is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation of the system. At each step of the iteration, a neural network approximates the costate function, and the optimal control policy is computed directly from that costate function, which removes the action network that appears in the ordinary approximate dynamic programming (ADP) method. Finally, two examples are given to demonstrate the validity and feasibility of the proposed optimal control scheme. © 2009 Acta Automatica Sinica. All rights reserved.
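
The abstract does not state the formulas, so the following is only an illustrative sketch of how a nonquadratic functional can remove a control constraint and let the control be read off the costate. It assumes a discrete-time affine system, a bound $|u_i| \le \bar{u}_i$ collected in $\bar{U} = \mathrm{diag}(\bar{u}_1, \dots, \bar{u}_m)$, a diagonal positive definite weight $R$, and $\varphi = \tanh$ applied elementwise, in the style of the saturating-actuator literature. All symbols here are assumptions, not taken from this record.

```latex
% Assumed system and cost (not given in the abstract):
x_{k+1} = f(x_k) + g(x_k)\,u_k, \qquad
J(x_k) = \sum_{i=k}^{\infty} \bigl[\, x_i^{T} Q x_i + W(u_i) \,\bigr]

% Nonquadratic functional that encodes the bound |u_i| <= \bar{u}_i:
W(u) = 2 \int_{0}^{u} \varphi^{-T}\!\bigl(\bar{U}^{-1} s\bigr)\, \bar{U} R \, ds

% Costate (the quantity the single network would approximate):
\lambda(x) = \frac{\partial J(x)}{\partial x}

% Stationarity of W(u_k) + J(x_{k+1}) with respect to u_k:
2 R \bar{U}\, \varphi^{-1}\!\bigl(\bar{U}^{-1} u_k\bigr) + g^{T}(x_k)\, \lambda(x_{k+1}) = 0

% Solving for u_k gives a control that stays inside the bound automatically:
u_k^{*} = \bar{U}\, \varphi\!\Bigl( -\tfrac{1}{2}\, (\bar{U} R)^{-1} g^{T}(x_k)\, \lambda(x_{k+1}) \Bigr)
```

Because $\varphi$ maps into $(-1, 1)$, the resulting $u_k^{*}$ satisfies the bound without an explicit constraint, and since it depends only on the costate, no separate action network is needed under these assumptions.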

Citation:

Pages: 1436-1445

Page count: 9

References (21 in total)

[1] Widrow B., Gupta N.K., Maitra S., Punish/reward: learning with a critic in adaptive threshold systems, IEEE Transactions on Systems, Man, and Cybernetics, 3, 5, pp. 455-465, (1973)

[2] Barto A.G., Sutton R.S., Anderson C.W., Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Transactions on Systems, Man, and Cybernetics, 13, 5, pp. 835-846, (1983)

[3] Werbos P.J., Approximate dynamic programming for real-time control and neural modeling, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, (1992)

[4] Bertsekas D.P., Tsitsiklis J.N., Neuro-Dynamic Programming, (1996)

[5] Prokhorov D.V., Wunsch D.C., Adaptive critic designs, IEEE Transactions on Neural Networks, 8, 5, pp. 997-1007, (1997)

[6] Si J., Wang Y.T., Online learning control by association and reinforcement, IEEE Transactions on Neural Networks, 12, 2, pp. 264-276, (2001)

[7] Liu D.R., Xiong X.X., Zhang Y., Action-dependent adaptive critic designs, Proceedings of the International Joint Conference on Neural Networks, pp. 990-995, (2001)

[8] Murray J.J., Cox C.J., Lendaris G.G., Saeks R., Adaptive dynamic programming, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 32, 2, pp. 140-153, (2002)

[9] Abu-Khalaf M., Lewis F.L., Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach, Automatica, 41, 5, pp. 779-791, (2005)

[10] Liu D.-R., Approximate dynamic programming for self-learning control, Acta Automatica Sinica, 31, 1, pp. 13-18, (2005)
