Global optimality of approximate dynamic programming and its use in non-convex function minimization

Cited by: 17
Authors
Heydari, Ali [1]
Balakrishnan, S. N. [2]
Affiliations
[1] South Dakota School of Mines and Technology, Department of Mechanical Engineering, Rapid City, SD 57701, USA
[2] Missouri University of Science and Technology, Mechanical and Aerospace Engineering Department, Rolla, MO 65409, USA
Keywords
Approximate dynamic programming; Fixed final time optimal control; Neural networks; Non-convex function minimization; Nonlinear systems
DOI
10.1016/j.asoc.2014.07.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study investigates the global optimality of approximate dynamic programming (ADP) based solutions using neural networks for optimal control problems with fixed final time. Issues including whether or not the cost function terms and the system dynamics need to be convex functions with respect to their respective inputs are discussed, and sufficient conditions for global optimality of the result are derived. Next, a new idea is presented to use ADP with neural networks for optimization of non-convex smooth functions. It is shown that any initial guess leads to direct movement toward the proximity of the global optimum of the function. This behavior is in contrast with gradient-based optimization methods, in which the movement is guided by the shape of the local level curves. Illustrative examples are provided with single and multi-variable functions that demonstrate the potential of the proposed method. © 2014 Elsevier B.V. All rights reserved.
Pages: 291-303
Page count: 13