Stochastic Multi-armed Bandits in Constant Space

被引：0

作者：

Liau, David ^{[1
]}

Price, Eric ^{[1
]}

Song, Zhao ^{[1
]}

Yang, Ger ^{[1
]}

机构：

[1] Univ Texas Austin, Austin, TX 78712 USA

来源：

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84 | 2018年 / 84卷

关键词：

EXPLORATION;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all K arms. We give an algorithm using O(1) words of space with regret Sigma(K)(i=1) 1/Delta(i) log Delta(i)/Delta log T where Delta(i) is the gap between the best arm and arm i and Delta is the gap between the best and the second-best arms. If the rewards are bounded away from 0 and 1, this is within an O(log 1/Delta) factor of the optimum regret possible without space constraints.

引用

页数：9

共 50 条

[1] Stochastic Multi-Armed Bandits with Control Variates
Verma, Arun
Hanawal, Manjesh K.
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
[2] Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
Lancewicki, Tal
Segal, Shahar
Koren, Tomer
Mansour, Yishay
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[3] Robust Stochastic Multi-Armed Bandits with Historical Data
Yacobi, Sarah Boufelja
Bounefouf, Djallel
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 959 - 965
[4] Networked Stochastic Multi-Armed Bandits with Combinatorial Strategies
Tang, Shaojie
Zhou, Yaqin
Han, Kai
Zhang, Zhao
Yuan, Jing
Wu, Weili
2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2017), 2017, : 786 - 793
[5] Anytime optimal algorithms in stochastic multi-armed bandits
Degenne, Remy
Perchet, Vianney
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
[6] Parametrized Stochastic Multi-armed Bandits with Binary Rewards
Jiang, Chong
Srikant, R.
2011 AMERICAN CONTROL CONFERENCE, 2011, : 119 - 124
[7] Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
Wang, Zhiwei
Wang, Huazheng
Wang, Hongning
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15770 - 15777
[8] Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets
Wan, Zongqi
Zhang, Zhijie
Li, Tongyang
Zhang, Jialin
Sun, Xiaoming
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 10087 - 10094
[9] On Kernelized Multi-armed Bandits
Chowdhury, Sayak Ray
Gopalan, Aditya
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[10] Regional Multi-Armed Bandits
Wang, Zhiyang
Zhou, Ruida
Shen, Cong
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84

← 1 2 3 4 5 →