Rollout algorithms for stochastic scheduling problems

Cited by: 203
Authors
Bertsekas, DP [1]
Castañon, DA
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[2] Boston Univ, Dept Elect Engn, Burlington, MA 01803 USA
[3] Alphatech Inc, Burlington, MA 01803 USA
Keywords
rollout algorithms; scheduling; neuro-dynamic programming
DOI
10.1023/A:1009634810396
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Stochastic scheduling problems are difficult stochastic control problems with combinatorial decision spaces. In this paper we focus on a class of stochastic scheduling problems, the quiz problem and its variations. We discuss the use of heuristics for their solution, and we propose rollout algorithms based on these heuristics which approximate the stochastic dynamic programming algorithm. We show how the rollout algorithms can be implemented efficiently, with considerable savings in computation over optimal algorithms. We delineate circumstances under which the rollout algorithms are guaranteed to perform better than the heuristics on which they are based. We also show computational results which suggest that the performance of the rollout policies is near-optimal, and is substantially better than the performance of their underlying heuristics.
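The rollout idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard formulation of the quiz problem, which the abstract does not spell out: question i is answered correctly with probability p[i] and earns reward v[i], and the quiz ends at the first wrong answer. The limit on the number of attempts (`max_questions`) is a hypothetical variation chosen for illustration, and all function names are made up for this example. The base heuristic orders questions by the index p*v/(1-p); the rollout policy picks each next question by scoring it together with the heuristic's completion of the remaining schedule.

```python
from math import inf


def expected_reward(order, p, v):
    """Expected reward of attempting questions in `order`, stopping at the first wrong answer."""
    total, survive = 0.0, 1.0
    for i in order:
        survive *= p[i]          # probability that all answers so far are correct
        total += survive * v[i]  # reward v[i] is collected only in that case
    return total


def index_heuristic(remaining, slots, p, v):
    """Base heuristic (assumed): sort by p*v/(1-p), keep the first `slots` questions."""
    def index(i):
        return inf if p[i] >= 1.0 else p[i] * v[i] / (1.0 - p[i])
    # Ties are broken by question number so the ordering is deterministic.
    return sorted(remaining, key=lambda i: (-index(i), i))[:slots]


def rollout_order(p, v, max_questions):
    """One-step rollout: choose each next question by evaluating it followed by the heuristic."""
    chosen, remaining = [], set(range(len(p)))
    while remaining and len(chosen) < max_questions:
        slots_after = max_questions - len(chosen) - 1

        def score(q):
            rest = index_heuristic(remaining - {q}, slots_after, p, v)
            return expected_reward(chosen + [q] + rest, p, v)

        best = max(sorted(remaining), key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    p = [0.9, 0.5, 0.2, 0.8]
    v = [1.0, 5.0, 20.0, 2.0]
    base = index_heuristic(set(range(len(p))), 2, p, v)
    roll = rollout_order(p, v, 2)
    print("heuristic:", base, expected_reward(base, p, v))
    print("rollout:  ", roll, expected_reward(roll, p, v))
```

Because this heuristic always completes a schedule in the same way after its own first choice, the rollout schedule built here is never worse than the heuristic's schedule, which is one instance of the kind of guarantee the abstract mentions. In this sketch each candidate is scored exactly by `expected_reward`; in variations where exact evaluation is impractical, the score would typically be estimated by simulation instead.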
Pages: 89-108
Page count: 20