Reinforcement learning-based output feedback control of nonlinear systems with input constraints

Cited by: 96

Authors:

He, P [1]

Jagannathan, S [1]

Affiliations:

[1] Univ Missouri, Dept Elect & Comp Engn, Rolla, MO 65409 USA

Source:

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2005 / Vol. 35 / No. 01

Keywords:

neural networks (NNs); output feedback control; reinforcement learning

DOI:

10.1109/TSMCB.2004.840124

Chinese Library Classification (CLC) code:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

A novel neural network (NN)-based output feedback controller with magnitude constraints is designed to deliver a desired tracking performance for a class of multi-input and multi-output (MIMO) strict-feedback nonlinear discrete-time systems. Reinforcement learning is proposed for the output feedback controller, which uses three NNs: 1) an NN observer to estimate the system states from the input-output data, 2) a critic NN to approximate a certain strategic utility function, and 3) an action NN to minimize both the strategic utility function and the unknown dynamics estimation errors. Using the Lyapunov approach, the uniform ultimate boundedness (UUB) of the state estimation errors, the tracking errors, and the weight estimates is shown.
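The abstract does not say how the magnitude constraint on the control input is enforced. As a minimal sketch only, assuming a symmetric elementwise bound, the constrained input could be obtained by clipping the command produced by the action NN. The function name `saturate`, the bound `u_max`, and the sample values are hypothetical and are not taken from the paper.

```python
def saturate(u, u_max):
    """Clip each control component to the interval [-u_max, u_max].

    Assumption: a symmetric elementwise magnitude bound; the paper's
    actual constraint formulation is not given in the abstract.
    """
    if u_max <= 0:
        raise ValueError("u_max must be positive")
    return [max(-u_max, min(u_max, ui)) for ui in u]


# Hypothetical unconstrained command for a 3-input system.
u_raw = [0.4, -2.5, 1.7]
u_applied = saturate(u_raw, u_max=1.0)
print(u_applied)  # [0.4, -1.0, 1.0]
```

Components already inside the bound pass through unchanged; only the components that exceed the bound are limited.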

Pages: 150-154

Page count: 5

References (9 entries):

[1]

Bertsekas D., 1996, NEURO DYNAMIC PROGRA, V1st

[2] ADAPTIVE-CONTROL OF A CLASS OF NONLINEAR DISCRETE-TIME-SYSTEMS USING NEURAL NETWORKS [J].

CHEN, FC ;

KHALIL, HK .

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1995, 40 (05) :791-801

[3] Adaptive NN control for a class of discrete-time non-linear systems [J].

Ge, SS ;

Lee, TH ;

Li, GY ;

Zhang, J .

INTERNATIONAL JOURNAL OF CONTROL, 2003, 76 (04) :334-354

[4] STOCHASTIC CHOICE OF BASIS FUNCTIONS IN ADAPTIVE FUNCTION APPROXIMATION AND THE FUNCTIONAL-LINK NET [J].

IGELNIK, B ;

PAO, YH .

IEEE TRANSACTIONS ON NEURAL NETWORKS, 1995, 6 (06) :1320-1329

[5]

Krstic M., 1995, Nonlinear and Adaptive Control Design

[6]

Liu X, 2000, P AMER CONTR CONF, P1929, DOI 10.1109/ACC.2000.879538

[7] On-line learning control by association and reinforcement [J].

Si, J ;

Wang, YT .

IEEE TRANSACTIONS ON NEURAL NETWORKS, 2001, 12 (02) :264-276

[8]

Werbos P. J., 1992, HDB INTELLIGENT CONT, P65

[9] ADAPTIVE OUTPUT-FEEDBACK DESIGN FOR A CLASS OF NONLINEAR DISCRETE-TIME-SYSTEMS [J].

YEH, PC ;

KOKOTOVIC, PV .

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1995, 40 (09) :1663-1668
