Approximate Optimal Control of Affine Nonlinear Continuous-Time Systems Using Event-Sampled Neurodynamic Programming

Cited by: 63

Authors:

Sahoo, Avimanyu ^{[1]}

Xu, Hao ^{[2,3]}

Jagannathan, Sarangapani ^{[4]}

Affiliations:

[1] DEI Grp, Millersville, MD 21108 USA

[2] Texas A&M Univ Corpus Christi, Coll Sci & Engn, Corpus Christi, TX 78412 USA

[3] Texas A&M Univ Corpus Christi, Unmanned Syst Res Lab, Corpus Christi, TX 78412 USA

[4] Missouri Univ Sci & Technol, Rolla, MO 65409 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2017 / Vol. 28 / No. 03

Funding:

National Science Foundation (USA);

Keywords:

Adaptive dynamic programming (ADP); event-sampled control; Hamilton-Jacobi-Bellman (HJB) equation; neural networks (NNs); optimal control; DYNAMICAL-SYSTEMS;

DOI:

10.1109/TNNLS.2016.2539366

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents an approximate optimal control of nonlinear continuous-time systems in affine form by using the adaptive dynamic programming (ADP) with event-sampled state and input vectors. The knowledge of the system dynamics is relaxed by using a neural network (NN) identifier with event-sampled inputs. The value function, which becomes an approximate solution to the Hamilton-Jacobi-Bellman equation, is generated by using an event-sampled NN approximator. Subsequently, the NN identifier and the approximated value function are utilized to obtain the optimal control policy. Both the identifier and value function approximator weights are tuned only at the event-sampled instants, leading to an aperiodic update scheme. A novel adaptive event sampling condition is designed to determine the sampling instants, such that the approximation accuracy and the stability are maintained. A positive lower bound on the minimum inter-sample time is guaranteed to avoid an accumulation point, and the dependence of inter-sample time upon the NN weight estimates is analyzed. A local ultimate boundedness of the resulting nonlinear impulsive dynamical closed-loop system is shown. Finally, a numerical example is utilized to evaluate the performance of the near-optimal design. The net result is the design of an event-sampled ADP-based controller for nonlinear continuous-time systems.
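The general idea of event-sampled feedback mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a scalar linear plant, a fixed feedback gain instead of NN-based identifier and value function approximators, and a fixed relative threshold instead of the adaptive sampling condition. The function name and all parameter values (`a`, `b`, `k`, `sigma`, `dt`, `steps`) are hypothetical and chosen only for illustration.

```python
def simulate_event_sampled(a=1.0, b=1.0, k=3.0, sigma=0.2,
                           x0=1.0, dt=1e-3, steps=5000):
    """Euler simulation of x' = a*x + b*u with u = -k * x_s.

    x_s is the most recently sampled state. It is refreshed only when
    the gap |x - x_s| exceeds sigma * |x| (an assumed, fixed relative
    threshold, not the adaptive condition from the paper).
    """
    x = x0
    x_s = x0      # last sampled (transmitted) state
    events = 0    # number of sampling events triggered
    for _ in range(steps):
        # Event condition: sample only when the held state is stale.
        if abs(x - x_s) > sigma * abs(x):
            x_s = x
            events += 1
        u = -k * x_s                  # control holds between events
        x += dt * (a * x + b * u)     # forward Euler step
    return x, events


if __name__ == "__main__":
    x_final, events = simulate_event_sampled()
    print(f"final state: {x_final:.3e}, events: {events} of 5000 steps")
```

With these assumed values the state still decays toward zero while the control is recomputed at only a small fraction of the integration steps, which is the aperiodic behavior an event-sampled scheme aims for.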


Pages: 639-652

Page count: 14

