Finite-horizon optimal control for unknown systems with saturating control inputs

Cited by: 0
Authors
Cui X.-H. [1 ,2 ]
Luo Y.-H. [1 ]
Zhang H.-G. [1 ]
Zu P.-F. [2 ]
Affiliations
[1] School of Information Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning
[2] Institute of Mathematical Sciences, Mudanjiang Normal College, Mudanjiang, 157011, Heilongjiang
Source
Zhang, Hua-Guang (hgzhang@ieee.org) | 2016, Vol. 33 | South China University of Technology
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Adaptive dynamic programming; Finite-horizon; Neural network; Optimal control;
DOI
10.7641/CTA.2016.41195
Abstract
An adaptive dynamic programming (ADP)-based online integral reinforcement learning algorithm is designed for the finite-horizon optimal control of nonlinear continuous-time systems with saturating control inputs and partially unknown dynamics, and the convergence of the algorithm is proved. First, the control constraints are handled through a nonquadratic functional. Second, a single neural network (NN) with constant weights and time-dependent activation functions is designed to approximate the unknown continuous value function; compared with the traditional dual-network structure, the single NN reduces the computational burden. Meanwhile, the NN weights are updated by the least-squares method, taking both the residual error and the terminal error into account. Furthermore, the convergence of the NN-based iterative value function is proved. Finally, two simulation examples demonstrate the effectiveness of the proposed algorithm. © 2016, Editorial Department of Control Theory & Applications, South China University of Technology. All rights reserved.
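As a concrete illustration of the constraint-handling idea mentioned in the abstract (a sketch, not the paper's own code): for a scalar input saturated at |u| < λ, the nonquadratic penalty commonly used in this line of work replaces the quadratic term rU² with W(u) = 2r∫₀ᵘ λ·atanh(v/λ) dv, which grows unboundedly as u approaches the saturation bound and so keeps the resulting control law inside it. The function names, the scalar setting, and the numerical quadrature below are illustrative assumptions.

```python
import numpy as np

def saturation_cost(u, lam=1.0, r=1.0, n=2001):
    """Nonquadratic control penalty W(u) = 2*r*integral_0^u lam*atanh(v/lam) dv
    for a scalar input saturated at |u| < lam, evaluated with the
    composite trapezoidal rule (illustrative sketch, not the paper's code)."""
    v = np.linspace(0.0, u, n)                 # quadrature nodes on [0, u]
    f = r * lam * np.arctanh(v / lam)          # integrand: inverse of tanh saturation
    h = u / (n - 1)                            # uniform step; h = 0 when u = 0
    return 2.0 * h * (f.sum() - 0.5 * (f[0] + f[-1]))

def saturation_cost_closed(u, lam=1.0, r=1.0):
    """Closed form of the same integral, used here only as a cross-check:
    W(u) = 2*r*lam*u*atanh(u/lam) + r*lam^2*ln(1 - (u/lam)^2)."""
    x = u / lam
    return 2.0 * r * lam * u * np.arctanh(x) + r * lam**2 * np.log(1.0 - x**2)
```

Note how the penalty is zero at u = 0 and increases steeply near the bound, which is what forces the minimizing control into the admissible set; the constant NN weights in the value-function approximation are then fitted by ordinary least squares over the stacked residual- and terminal-error equations.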
Pages: 631-637
Number of pages: 6