Approximate Finite-horizon Optimal Control with Policy Iteration

Cited by: 0

Authors:

Zhao Zhengen [1]

Yang Ying [1]

Li Hao [1]

Liu Dan [1]

Affiliations:

[1] Peking Univ, State Key Lab Turbulence & Complex Syst, Dept Mech & Engn Sci, Coll Engn, Beijing 100871, Peoples R China

Source:

2014 33rd Chinese Control Conference (CCC) | 2014

Keywords:

Finite-horizon; Policy Iteration; Input Constraints; Neural Networks Approximation; HJB Equation; Least Squares; NETWORK HJB APPROACH; NONLINEAR-SYSTEMS; TIME

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In this paper, the policy iteration algorithm for the finite-horizon optimal control of continuous-time systems is addressed. The finite-horizon optimal control problem with input constraints is formulated as a Hamilton-Jacobi-Bellman (HJB) equation by using a suitable nonquadratic function. The value function of the HJB equation is obtained by solving, with policy iteration, a sequence of cost functions that satisfy generalized HJB (GHJB) equations. The convergence of the policy iteration algorithm is proved and the admissibility of each iterative policy is discussed. With a least squares method and a neural network (NN) approximation of the cost function, the approximate solution of the GHJB equation is shown to converge uniformly to that of the HJB equation. A numerical example is given to illustrate the result.
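The policy iteration loop described in the abstract can be illustrated on a much simpler special case. The sketch below is not the paper's method: it assumes a scalar linear system dx/dt = a x + b u with quadratic cost, no input constraints, and no neural network, so that the value function is V(t, x) = p(t) x^2 and each GHJB equation reduces to a linear ODE in p(t). All names (`evaluate_policy`, `improve_policy`, `policy_iteration`, `riccati_reference`) and all parameter values are hypothetical choices for illustration.

```python
def evaluate_policy(k, a, b, q, r, s, dt):
    """Policy evaluation for u = -k(t) x.

    With V(t, x) = p(t) x^2 the GHJB equation becomes the linear ODE
    -dp/dt = q + r k^2 + 2 (a - b k) p with p(T) = s, integrated here
    backward in time with explicit Euler steps.
    """
    p = [0.0] * len(k)
    p[-1] = s  # terminal cost weight
    for i in range(len(k) - 1, 0, -1):
        rhs = q + r * k[i] ** 2 + 2.0 * (a - b * k[i]) * p[i]
        p[i - 1] = p[i] + dt * rhs
    return p


def improve_policy(p, b, r):
    """Policy improvement: the gain k = b p / r minimizes the Hamiltonian."""
    return [b * p_i / r for p_i in p]


def policy_iteration(a, b, q, r, s, horizon, steps, tol=1e-10, max_iter=50):
    """Alternate evaluation and improvement until p(t) stops changing."""
    dt = horizon / steps
    k = [0.0] * (steps + 1)  # assumed initial policy: u = 0
    history = []
    for _ in range(max_iter):
        p = evaluate_policy(k, a, b, q, r, s, dt)
        history.append(p)
        if len(history) > 1:
            change = max(abs(x - y) for x, y in zip(p, history[-2]))
            if change < tol:
                break
        k = improve_policy(p, b, r)
    return p, history


def riccati_reference(a, b, q, r, s, horizon, steps):
    """Direct Euler integration of -dp/dt = q + 2 a p - b^2 p^2 / r."""
    dt = horizon / steps
    p = [0.0] * (steps + 1)
    p[-1] = s
    for i in range(steps, 0, -1):
        p[i - 1] = p[i] + dt * (q + 2.0 * a * p[i] - b ** 2 * p[i] ** 2 / r)
    return p


# Illustrative parameters (assumptions, not taken from the paper).
params = dict(a=1.0, b=1.0, q=1.0, r=1.0, s=0.5, horizon=2.0, steps=2000)
p_pi, history = policy_iteration(**params)
p_ref = riccati_reference(**params)
print("iterations:", len(history))
print("p(0) from policy iteration:", p_pi[0])
print("p(0) from Riccati reference:", p_ref[0])
```

In this special case the HJB equation is the Riccati differential equation, so the iterates can be checked against a direct integration on the same time grid. The constrained problem in the paper replaces the quadratic control penalty with a nonquadratic function and the closed-form p(t) x^2 with a neural network fitted by least squares, neither of which this sketch attempts.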


Pages: 8889-8894

Page count: 6

References (10 in total):

[1] Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach [J]. Abu-Khalaf, M; Lewis, FL. Automatica, 2005, 41(5): 779-791

[2] Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation [J]. Beard, RW; Saridis, GN; Wen, JT. Automatica, 1997, 33(12): 2159-2177

[3] Fixed-final-time-constrained optimal control of nonlinear systems using neural network HJB approach [J]. Cheng, Tao; Lewis, Frank L.; Abu-Khalaf, Murad. IEEE Transactions on Neural Networks, 2007, 18(6): 1725-1737

[4] A neural network solution for fixed-final time optimal control of nonlinear systems [J]. Cheng, Tao; Lewis, Frank L.; Abu-Khalaf, Murad. Automatica, 2007, 43(3): 482-490

[5] Finite-horizon dynamic optimization of nonlinear systems in real time [J]. Costanza, Vicente; Rivadeneira, Pablo S. Automatica, 2008, 44(9): 2427-2434

[6] Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks [J]. Hornik, K; Stinchcombe, M; White, H. Neural Networks, 1990, 3(5): 551-560

[7] Luo B., 2013, 13110396 ARXIV

[8] Lyshevski SE, 1998, P AMER CONTR CONF, P205, DOI 10.1109/ACC.1998.694659

[9] Approximation theory of optimal control for trainable manipulators [J]. Saridis, GN. IEEE Transactions on Systems, Man, and Cybernetics, 1979, 9(3): 152-159

[10] Adaptive optimal control for continuous-time linear systems based on policy iteration [J]. Vrabie, D.; Pastravanu, O.; Abu-Khalaf, M.; Lewis, F. L. Automatica, 2009, 45(2): 477-484
