Perspectives of approximate dynamic programming

被引：67

作者：

Powell, Warren B. ^{[1
]}

机构：

[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA

来源：

ANNALS OF OPERATIONS RESEARCH | 2016年 / 241卷 / 1-2期

关键词：

LEARNING CONTROL; SELECTION; ALGORITHMS; NETWORKS;

D O I：

10.1007/s10479-012-1077-6

中图分类号：

C93 [管理学]; O22 [运筹学];

学科分类号：

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

摘要：

Approximate dynamic programming has evolved, initially independently, within operations research, computer science and the engineering controls community, all searching for practical tools for solving sequential stochastic optimization problems. More so than other communities, operations research continued to develop the theory behind the basic model introduced by Bellman with discrete states and actions, even while authors as early as Bellman himself recognized its limits due to the "curse of dimensionality" inherent in discrete state spaces. In response to these limitations, subcommunities in computer science, control theory and operations research have developed a variety of methods for solving different classes of stochastic, dynamic optimization problems, creating the appearance of a jungle of competing approaches. In this article, we show that there is actually a common theme to these strategies, and underpinning the entire field remains the fundamental algorithmic strategies of value and policy iteration that were first introduced in the 1950's and 60's.

引用

页码：319 / 356

页数：38

共 99 条

[1]

[Anonymous], 1999, Neural network control of robot manipulators and nonlinear systems

[2]

[Anonymous], 1997, Introduction to stochastic programming

[3]

[Anonymous], 1977, ART THEORY DYNAMIC P

[4]

[Anonymous], 1991, Simulation modeling and analysis

[5]

[Anonymous], 2011, Approximate dynamic programming: Solving the curses of dimensionality

[6] ASSOCIATIVE SEARCH NETWORK - A REINFORCEMENT LEARNING ASSOCIATIVE MEMORY [J].

BARTO, AG ;

SUTTON, RS ;

BROUWER, PS .

BIOLOGICAL CYBERNETICS, 1981, 40 (03) :201-211

[7] NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS [J].

BARTO, AG ;

SUTTON, RS ;

ANDERSON, CW .

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1983, 13 (05) :834-846

[8] DYNAMIC PROGRAMMING [J].

BELLMAN, R .

SCIENCE, 1966, 153 (3731) :34-&

[9]

Bellman R., 1971, Introduction to the Mathematical Theory of Control Processes, V2

[10]

Bellman Richard, 1959, MATH TABLES OTHER AI, V13, P247, DOI DOI 10.2307/2002797

← 1 2 3 4 5 6 7 8 9 10 →