restless bandits;
stochastic scheduling;
index policies;
indexability;
control by price;
semi-Markov decision processes;
dynamic resource allocation;
diminishing returns;
marginal productivity;
efficient frontier;
convex optimization;
bias;
mixed criteria;
make to order;
make to stock;
control of queues;
production-inventory control;
partial conservation laws;
achievable performance;
DOI:
10.1287/moor.1050.0165
Chinese Library Classification (CLC):
C93 [Management];
O22 [Operations Research];
Discipline classification codes:
070105 ;
12 ;
1201 ;
1202 ;
120202 ;
Abstract:
This paper presents a framework, grounded in convex optimization and ideas from economics, for solving by index policies problems of optimal dynamic allocation of effort to a discrete-state (finite or countable) binary-action (work/rest) semi-Markov restless bandit project, elucidating issues raised by previous work. Its contributions include: (i) the concept of a restless bandit's marginal productivity index (MPI), characterizing optimal policies relative to general cost and work measures; (ii) the characterization of indexable restless bandits as those satisfying diminishing marginal returns to work, consistently with a nested family of threshold policies; (iii) sufficient indexability conditions via partial conservation laws (PCLs); (iv) the characterization of the MPI as an optimal marginal productivity rate relative to feasible active-state sets; (v) application to semi-Markov bandits under several criteria, including a new mixed average-bias criterion; and (vi) PCL-indexability analyses and MPIs for optimal service control of make-to-order/make-to-stock queues with convex holding costs, under discounted and average-bias criteria.
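As a rough illustration of how an MPI is used in practice (a minimal sketch, not from the paper): once an index value has been computed for each state of each project, the resulting priority-index policy simply works, at each decision epoch, on the projects whose current states carry the highest indices. The index table `mpi` and the function name `index_policy` below are hypothetical example data for illustration only.

```python
# Hypothetical sketch of a priority-index policy for restless bandits:
# given each project's current state and a precomputed table of
# marginal productivity indices (MPIs), activate the m projects
# whose current states have the largest index values.

def index_policy(states, mpi, m):
    """Return the set of project ids to activate: the m projects whose
    current state carries the highest MPI (ties broken by project id)."""
    ranked = sorted(range(len(states)),
                    key=lambda k: mpi[k][states[k]],
                    reverse=True)
    return set(ranked[:m])

# Example: 3 two-state projects with made-up MPI tables (state -> index).
mpi = [
    {0: 0.2, 1: 0.9},   # project 0
    {0: 0.5, 1: 0.1},   # project 1
    {0: 0.4, 1: 0.8},   # project 2
]
states = [1, 0, 1]       # current state of each project
print(index_policy(states, mpi, m=2))  # activates projects 0 and 2
```

The indices here are placeholders; in the paper's framework they would be computed from the project's cost and work measures, and indexability (diminishing marginal returns to work) guarantees the policy is consistent with a nested family of threshold policies.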