Identifiability in training neural networks for reconfigurable control based on reinforcement learning

Cited by: 0
Authors
de Weerdt, E. [1 ]
Chu, Q. P. [1 ]
Mulder, J. A. [1 ]
Affiliation
[1] Delft Univ Technol, Fac Aerosp Engn, POB 5058, NL-3600 GB Delft, Netherlands
Source
PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL | 2006
Keywords
reconfigurable control; reinforcement learning; neural networks; Newton-Gauss; parameter identifiability; recency-effect;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
The field of reconfigurable control has become increasingly active in recent years. Many control strategies for constructing an autonomous, adaptive control system have been developed. However, many of those strategies require a Failure Detection and Isolation (FDI) system so that the control laws can be adapted properly. The major drawback of an FDI system is that the designer must foresee all possible failure scenarios, which is virtually impossible even for simple systems. In this paper, a reconfigurable control system based on Reinforcement Learning (RL) is proposed. Reinforcement Learning does not require an FDI system, which makes it well suited to reconfigurable control and to controlling unknown plants. Neural networks are used to overcome the 'curse of dimensionality' inherent in an RL controller. A novel method of batch updating, combined with a new training algorithm based on the Newton-Gauss method, is applied to solve two major problems of neural networks: the 'recency' effect and the identifiability problem. Closely related to the identifiability problem is the optimization of the neural network structure; to guarantee an optimal number of network parameters, an optimization algorithm is proposed. The techniques are applied to a simple system identification task and a pole-balancing task. Experiments show that the use of 'anti-recency' points circumvents the recency effect in the system identification case and speeds up stabilization for the RL controller.
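The Newton-Gauss (Gauss-Newton) update that the training algorithm builds on can be sketched as follows. This is a minimal hypothetical illustration on a one-neuron model f(x; w) = w1·tanh(w2·x), not the paper's implementation; the model, its starting parameters, and the small damping term (which guards against a singular J^T J, the symptom of non-identifiable parameters) are all assumptions for the sketch.

```python
import numpy as np

def model(w, x):
    # Tiny one-neuron "network": f(x; w) = w1 * tanh(w2 * x)
    return w[0] * np.tanh(w[1] * x)

def jacobian(w, x):
    # Columns: df/dw1, df/dw2 (analytic derivatives of the model)
    t = np.tanh(w[1] * x)
    return np.column_stack([t, w[0] * x * (1.0 - t**2)])

def gauss_newton(w, x, y, iters=100, damping=1e-8):
    # Batch Gauss-Newton: w <- w + (J^T J + damping*I)^{-1} J^T r
    for _ in range(iters):
        r = y - model(w, x)                      # residuals on the whole batch
        J = jacobian(w, x)                       # N x 2 Jacobian
        A = J.T @ J + damping * np.eye(len(w))   # damping keeps A invertible
        w = w + np.linalg.solve(A, J.T @ r)
    return w

# Synthetic batch generated from known parameters
x = np.linspace(-2.0, 2.0, 50)
w_true = np.array([2.0, 0.5])
y = model(w_true, x)

w_fit = gauss_newton(np.array([1.5, 0.8]), x, y)
print(w_fit)  # should approach [2.0, 0.5]
```

Because the update uses the whole batch of data at once, new samples do not overwrite what earlier samples taught the model, which is the intuition behind batch updating as a remedy for the recency effect.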
Pages: 7 / +
Page count: 2