Dynamic Control of Birth-and-Death Restless Bandits: Application to Resource-Allocation Problems

被引：28

作者：

Larranaga, Maialen ^{[1
]}

Ayesta, Urtzi ^{[2
,3
,4
,5
]}

Verloop, Ina Maria ^{[5
,6
]}

机构：

[1] Cent Supelec, CNRS, Signaux & Syst Lab, F-91190 Gif Sur Yvette, France

[2] CNRS, Lab Anal & Architecture Syst, F-31400 Toulouse, France

[3] Basque Fdn Sci, Ikerbasque, Bilbao 48011, Spain

[4] Univ Basque Country, UPV EHU, Donostia San Sebastian 20018, Spain

[5] Univ Toulouse, Lab Anal & Architecture Syst, Inst Natl Polytech, F-31400 Toulouse, France

[6] CNRS, Inst Rech Informat Toulouse, F-31071 Toulouse, France

来源：

IEEE-ACM TRANSACTIONS ON NETWORKING | 2016年 / 24卷 / 06期

关键词：

Whittle's index; restless bandits; birth-and-death processes; fluid approximation; Lagrangian relaxation; index policies; INDEX POLICIES; QUEUE; PERFORMANCE; RULE;

D O I：

10.1109/TNET.2016.2562564

中图分类号：

TP3 [计算技术、计算机技术];

学科分类号：

0812 ;

摘要：

We develop a unifying framework to obtain efficient index policies for restless multi-armed bandit problems with birth-and-death state evolution. This is a broad class of stochastic resource allocation problems whose objective is to determine efficient policies to share resources among competing projects. In a seminal work, Whittle developed a methodology to derive well-performing (Whittle's) index policies that are obtained by solving a relaxed version of the original problem. Our first main contribution is the derivation of a closed-form expression for Whittle's index as a function of the steady-state probabilities. In some particular cases, qualitative insights can be obtained from its expression; nevertheless, it requires several technical conditions to be verified. We, therefore, formulate a fluid version of the relaxed optimization problem, and in our second main contribution, we develop a fluid index policy. The latter does provide qualitative insights and it is equivalent to Whittle's index policy in the light-traffic regime. The applicability of our approach is illustrated by two important problems: optimal class selection and optimal load balancing. Allowing state-dependent capacities, we can model important phenomena, e.g., power-aware server-farms and opportunistic scheduling in wireless systems. Whittle's index and our fluid index policy show remarkably good performance in numerical simulations.

引用

页码：3812 / 3825

页数：14

共 29 条

[1]

Aalto Samuli, 2015, ACM SIGMETRICS Performance Evaluation Review, V43, P57, DOI 10.1145/2745844.2745851

[2] Whittle's index policy for a multi-class queueing system with convex holding costs [J].

P. S. Ansell ;

K. D. Glazebrook ;

J. Niño-Mora ;

M. O'Keeffe .

Mathematical Methods of Operations Research, 2003, 57 (1) :21-39

[3] The cμ/θ Rule for Many-Server Queues with Abandonment [J].

Atar, Rami ;

Giat, Chanit ;

Shimkin, Nahum .

OPERATIONS RESEARCH, 2010, 58 (05) :1427-1439

[4]

Avram F., 1994, STOCHASTIC NETWORKS, V71, P199

[5]

Bertsekas D., 2012, Dynamic Programming and Optimal Control

[6] The single-server scheduling problem with convex costs [J].

Bispo, Carlos F. .

QUEUEING SYSTEMS, 2013, 73 (03) :261-294

[7] User-level performance of channel-aware scheduling algorithms in wireless data networks [J].

Borst, S .

IEEE-ACM TRANSACTIONS ON NETWORKING, 2005, 13 (03) :636-647

[8] THE C-MU RULE REVISITED [J].

BUYUKKOC, C ;

VARAIYA, P ;

WALRAND, J .

ADVANCES IN APPLIED PROBABILITY, 1985, 17 (01) :237-238

[9] Dynamic control of a queue with adjustable service rate [J].

George, JM ;

Harrison, JM .

OPERATIONS RESEARCH, 2001, 49 (05) :720-731

[10]

Gittins J., 2011, Multi-armed bandit allocation indices

← 1 2 3 →