Approximately Stationary Bandits with Knapsacks

Cited by: 0
Authors
Fikioris, Giannis [1 ]
Tardos, Eva [1 ]
Institutions
[1] Cornell Univ, Ithaca, NY 14853 USA
Keywords
Bandits with Knapsacks; Best of both worlds; Adversarial; Approximately Stationary;
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Bandits with Knapsacks (BwK), the generalization of the Multi-Armed Bandits problem under global budget constraints, has received a lot of attention in recent years. It has numerous applications, including dynamic pricing, repeated auctions, ad allocation, and network scheduling. Previous work has focused on one of two extremes: Stochastic BwK, where the rewards and resource consumptions in each round are sampled from an i.i.d. distribution, and Adversarial BwK, where these parameters are picked by an adversary. The achievable guarantees in the two cases exhibit a massive gap: no-regret learning is achievable in the stochastic case, but in the adversarial case only competitive-ratio-style guarantees are achievable, where the competitive ratio depends either on the budget or on both the time horizon and the number of resources. What makes this gap so vast is that in Adversarial BwK the guarantees get worse in the typical case where the budget is more binding. While "best-of-both-worlds" algorithms are known (single algorithms that provide the best achievable guarantee in each extreme case), their bounds degrade to the adversarial case as soon as the environment is not fully stochastic. Our work aims to bridge this gap, offering guarantees for workloads that are not exactly stochastic but are also not worst-case. We define a condition, Approximately Stationary BwK, that parameterizes how close to stochastic or adversarial an instance is. Based on these parameters, we study the best competitive ratio attainable in BwK. We present two algorithms that are oblivious to the values of the parameters but guarantee competitive ratios that smoothly transition between the best possible guarantees in the two extreme cases, depending on the values of the parameters. Our guarantees offer a significant improvement over the adversarial guarantee, especially when the available budget is small. We also prove bounds on the achievable guarantee, showing that our results are approximately tight when the budget is small.
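The setting the abstract describes — i.i.d. rewards and resource consumptions under a hard global budget — can be illustrated with a toy simulation of a stochastic BwK instance. Everything below (the two arms' means, the budget, the horizon, and the fixed policies) is an illustrative assumption, not taken from the paper; it only shows why a budget-aware policy can outperform a reward-greedy one when the budget is binding.

```python
import random

def run_bwk(policy, T=10_000, budget=500, seed=0):
    """Simulate a toy stochastic Bandits-with-Knapsacks instance.

    Two arms; each pull yields an i.i.d. Bernoulli (reward, consumption)
    pair. Play stops when the horizon T ends or the budget runs out.
    The arm parameters below are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    # (mean reward, mean consumption) per arm: arm 1 earns more but costs more.
    arms = [(0.3, 0.1), (0.9, 0.5)]
    total_reward, remaining = 0.0, float(budget)
    for t in range(T):
        mean_r, mean_c = arms[policy(t)]
        r = 1.0 if rng.random() < mean_r else 0.0
        c = 1.0 if rng.random() < mean_c else 0.0
        if c > remaining:  # hard global budget constraint: stop playing
            break
        total_reward += r
        remaining -= c
    return total_reward

# Fixed policies: always play the cheap arm vs. always play the lucrative arm.
cheap_first = run_bwk(lambda t: 0)
greedy_reward = run_bwk(lambda t: 1)
```

With these (assumed) parameters, the cheap arm's reward-per-unit-cost ratio is 0.3/0.1 = 3 versus 0.9/0.5 = 1.8 for the lucrative arm, so when the budget rather than the horizon is the bottleneck, always playing the cheap arm collects more total reward in expectation (roughly 1500 versus 900 here).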
Pages: 25