Markov decision process;
discounted reward;
average reward;
random walk;
stochastic knapsack problem;
turnpike;
DOI:
10.1137/S0040585X97T991325
Chinese Library Classification (CLC) number:
O21 [Probability Theory and Mathematical Statistics];
C8 [Statistics];
Discipline classification codes:
020208;
070103;
0714;
Abstract:
In this paper we revise the theory of turnpikes in discounted Markov decision processes, prove the turnpike theorem for the undiscounted model, and apply the results to a specific random walk.