Adaptive critic designs

Cited by: 883

Authors:

Prokhorov, DV

Wunsch, DC

Affiliation:

[1] Applied Computational Intelligence Laboratory, Department of Electrical Engineering, Texas Tech University, Lubbock

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS | 1997 / Vol. 8 / No. 05

Funding:

National Science Foundation (USA);

Keywords:

adaptive critic design (ACD); backpropagation; control; DHP; dynamic programming; GDHP; HDP; heuristic dynamic programming; neural network; neurocontrol; reinforcement learning;

DOI:

10.1109/72.623201

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We discuss a variety of adaptive critic designs (ACD's) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: Heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACD's. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACD's. This leads to a generalized training procedure for ACD's.

Pages: 997-1007

Number of pages: 11

References: 47 in total