Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances

Cited by: 198
Authors
Song, Ruizhuo [1 ]
Lewis, Frank L. [2 ,3 ]
Wei, Qinglai [4 ]
Zhang, Huaguang [5 ]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX 76118 USA
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[4] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
Funding
U.S. National Science Foundation; Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Adaptive critic designs; adaptive/approximate dynamic programming (ADP); dynamic programming; off-policy; optimal control; unknown system; OPTIMAL TRACKING CONTROL; ADAPTIVE OPTIMAL-CONTROL; TIME NONLINEAR-SYSTEMS; OPTIMAL-CONTROL SCHEME; FEEDBACK-CONTROL; ALGORITHM; ITERATION; DESIGN;
DOI
10.1109/TCYB.2015.2421338
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology];
Discipline code
0812;
Abstract
An optimal control method is developed in this paper for unknown continuous-time systems with unknown disturbances. An integral reinforcement learning (IRL) algorithm is presented to obtain the iterative control law, and off-policy learning allows the system dynamics to remain completely unknown. Neural networks are used to construct the critic and action networks. It is shown that, in the presence of unknown disturbances, off-policy IRL may fail to converge or may produce a biased solution. To reduce the influence of the unknown disturbances, a disturbance compensation controller is added. Based on Lyapunov techniques, the weight errors are proven to be uniformly ultimately bounded, and convergence of the Hamiltonian function is also proven. A simulation study demonstrates the effectiveness of the proposed optimal control method for unknown systems with disturbances.
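To illustrate the policy-iteration backbone that the abstract describes, the sketch below runs Kleinman-style policy iteration on a scalar LQR problem. This is a minimal, model-based illustration only: the paper's off-policy IRL algorithm performs the same evaluate/improve cycle but estimates the value function from trajectory data, with neural networks and a disturbance compensator, rather than from the known coefficients `a` and `b` used here. All names and the scalar setting are illustrative assumptions, not the paper's implementation.

```python
import math

def lqr_policy_iteration(a, b, q, r, k0=0.0, tol=1e-12, max_iter=100):
    """Kleinman-style policy iteration for the scalar LQR problem
        x' = a*x + b*u,   cost = integral of (q*x^2 + r*u^2) dt.
    Policy evaluation solves the scalar Lyapunov equation
        2*(a - b*k)*P + q + r*k**2 = 0,
    and policy improvement sets k <- b*P/r. Off-policy IRL performs
    these same two steps, but from measured data instead of (a, b)."""
    k = k0
    P = 0.0
    for _ in range(max_iter):
        ac = a - b * k                  # closed-loop dynamics
        assert ac < 0, "current policy must be stabilizing"
        P_new = (q + r * k * k) / (-2.0 * ac)   # Lyapunov solve
        k = b * P_new / r                       # greedy policy update
        if abs(P_new - P) < tol:
            return P_new, k
        P = P_new
    return P, k

# With a=-1, b=1, q=r=1, the Riccati equation -2P - P^2 + 1 = 0
# has positive root P = sqrt(2) - 1, and the optimal gain is k = P.
P, k = lqr_policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0)
```

The iteration converges quadratically from any stabilizing initial gain; the disturbance issue studied in the paper arises because, when learning `P` from data, an unknown disturbance contaminates the evaluation step and biases this fixed point.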
Pages
1041-1050 (10 pages)