Reinforcement learning control of nonlinear multi-link system

Cited by: 19
Authors
Bucak, IO [1]
Zohdy, MA [1]
Affiliations
[1] Oakland Univ, Dept Elect & Syst Engn, Rochester, MI 48309 USA
Keywords
reinforcement learning; learning control; nonlinear control; robotics; learning algorithm
DOI
10.1016/S0952-1976(01)00031-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper studies in detail the effects of basic parameters in reinforcement learning control: eligibility, constrained weights of the action and critic networks, system nonlinearities, gradient information, state-space partitioning, and the variance of exploration. The aim is to make the approach more feasible for practical applications and implementation, to improve learning efficiency, and to enhance performance. A novel adaptive grid algorithm is also proposed to overcome the difficulty of partitioning the input space and thereby achieve better performance. Reinforcement learning is applied to the control of nonlinear one-link and two-link robots. This problem requires that learning be performed on-line, based on a binary or real-valued reinforcement signal from a critic network, without knowledge of the system model or nonlinearity. (C) 2002 Elsevier Science Ltd. All rights reserved.
Pages: 563-575
Number of pages: 13
Related papers
16 in total
[1]  
Anderson C. W., 1989, IEEE Control Systems Magazine, V9, P31, DOI 10.1109/37.24809
[2]  
Bertsekas D., 1996, NEURO DYNAMIC PROGRA, V1st
[3]  
Bucak I. O., 1999, Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251), P1198, DOI 10.1109/ACC.1999.783230
[4]  
Bucak IO, 1998, P AMER CONTR CONF, P1405, DOI 10.1109/ACC.1998.707055
[5]   A self-learning fuzzy logic controller using genetic algorithms with reinforcements [J].
Chiang, CK ;
Chung, HY ;
Lin, JJ .
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1997, 5 (03) :460-467
[6]  
COSTA EF, 1997, P AM CONTR C AM AUT
[7]   A STOCHASTIC REINFORCEMENT LEARNING ALGORITHM FOR LEARNING REAL-VALUED FUNCTIONS [J].
GULLAPALLI, V .
NEURAL NETWORKS, 1990, 3 (06) :671-692
[8]  
Gullapalli V., 1994, IEEE CONTROL SYSTEMS
[9]  
HIRASHIMA Y, 1999, P 38 C DE CONTR IEEE
[10]  
HIRZINGER G, 1996, IEEE ASME T MECHATRO, V1, P150