Constrained Online Optimal Control for Continuous-Time Nonlinear Systems Using Neuro-Dynamic Programming

Times cited: 0
Authors
Yang Xiong [1]
Liu Derong [1]
Wang Ding [1]
Ma Hongwen [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Source
2014 33RD CHINESE CONTROL CONFERENCE (CCC) | 2014
Keywords
Constrained input; Neuro-dynamic programming; Nonlinear systems; Online control; Optimal control;
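DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract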
This paper develops an online adaptive optimal control scheme to solve the infinite-horizon optimal control problem of continuous-time nonlinear systems with control constraints. A novel architecture is presented to approximate the solution of the Hamilton-Jacobi-Bellman equation: only a critic neural network is used to derive the optimal control, instead of the typical action-critic dual networks employed in neuro-dynamic programming methods. Unlike existing tuning laws for the critic, the newly developed critic update rule not only ensures convergence of the critic to the optimal control but also guarantees that the closed-loop system is uniformly ultimately bounded. In addition, no initial stabilizing control is required. Finally, an example is provided to verify the effectiveness of the present approach.
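To illustrate what a critic-only design means in practice, the sketch below computes a bounded control directly from the gradient of an approximate value function, with no separate action network. The abstract does not give the paper's equations, so everything here is an assumption: the control law u = -λ tanh(gᵀ∇V / (2λr)) is the tanh-saturated form commonly used in the constrained-input HJB literature, and the quadratic critic features, the weights, and the input gain `g_example` are hypothetical. The critic update rule itself is not reproduced.

```python
import math


def critic_gradient(x, weights):
    """Gradient of the assumed critic V_hat(x) = w1*x1^2 + w2*x1*x2 + w3*x2^2."""
    w1, w2, w3 = weights
    x1, x2 = x
    return (2.0 * w1 * x1 + w2 * x2, w2 * x1 + 2.0 * w3 * x2)


def constrained_control(x, weights, g, r=1.0, bound=1.0):
    """Scalar control u = -bound * tanh(g(x)^T grad V_hat(x) / (2 * bound * r)).

    tanh maps into (-1, 1), so |u| never exceeds `bound`, whatever the
    critic weights are.
    """
    grad = critic_gradient(x, weights)
    inner = sum(gi * dvi for gi, dvi in zip(g(x), grad))
    return -bound * math.tanh(inner / (2.0 * bound * r))


def g_example(x):
    """Hypothetical input gain vector: the control enters the second state only."""
    return (0.0, 1.0)


# Hypothetical state and critic weights, with the input limited to |u| <= 0.5.
u = constrained_control((2.0, -3.0), (0.5, 0.0, 1.0), g_example, r=1.0, bound=0.5)
print(u)
```

Because the control is a closed-form function of the critic gradient, only the critic weights need to be tuned online, which is the structural point the abstract makes about dropping the action network.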
Pages: 8717-8722
Page count: 6
Related papers
20 in total
  • [1] Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    Abu-Khalaf, M
    Lewis, FL
    [J]. AUTOMATICA, 2005, 41 (05) : 779 - 791
  • [2] Neurodynamic programming and zero-sum games for constrained control systems
    Abu-Khalaf, Murad
    Lewis, Frank L.
    Huang, Jie
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (07) : 1243 - 1252
  • [3] [Anonymous], 1999, Neural network control of robot manipulators and nonlinear systems
  • [4] [Anonymous], 1974, Ph.D. Thesis
  • [5] [Anonymous], 2002, NONLINEAR SYSTEMS
  • [6] Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    Beard, RW
    Saridis, GN
    Wen, JT
    [J]. AUTOMATICA, 1997, 33 (12) : 2159 - 2177
  • [7] Bellman R. E., 1957, Dynamic programming. Princeton landmarks in mathematics
  • [8] Dierks T, 2010, P AMER CONTR CONF, P1568
  • [9] MULTILAYER FEEDFORWARD NETWORKS ARE UNIVERSAL APPROXIMATORS
    HORNIK, K
    STINCHCOMBE, M
    WHITE, H
    [J]. NEURAL NETWORKS, 1989, 2 (05) : 359 - 366
  • [10] Lewis F. L., 2013, Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, V17