Approximate dynamic programming strategies and their applicability for process control: A review and future directions

Cited by: 1

Authors:

Lee, JM [1]

Lee, JH [1]

Affiliation:

[1] Georgia Inst Technol, Sch Chem & Biomol Engn, Atlanta, GA 30332 USA

Source:

INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS | 2004, Vol. 2, No. 3

Keywords:

approximate dynamic programming; reinforcement learning; neuro-dynamic programming; optimal control; function approximation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper reviews dynamic programming (DP), surveys approximate solution methods for it, and considers their applicability to process control problems. Reinforcement learning (RL) and neuro-dynamic programming (NDP), which can be viewed as approximate DP techniques, are already established methods for solving difficult multi-stage decision problems in operations research, computer science, and robotics. Owing to significant differences in problem formulations and objectives, however, the algorithms and techniques available from these fields are not directly applicable to process control problems, and reformulations based on an accurate understanding of these techniques are needed. We categorize the currently available approximate solution techniques for dynamic programming and identify those most suitable for process control problems. Several open issues are also identified and discussed.

Pages: 263-278

Page count: 16
