Multi-Armed Bandits with Bounded Arm-Memory: Near-Optimal Guarantees for Best-Arm Identification and Regret Minimization

被引：0

作者：

Maiti, Arnab ^{[1
,3
]}

Patil, Vishakha ^{[2
]}

Khan, Arindam ^{[2
]}

机构：

[1] Indian Inst Technol, Kharagpur, W Bengal, India

[2] Indian Inst Sci, Bangalore, Karnataka, India

[3] IISc, Bangalore, Karnataka, India

来源：

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021年 / 34卷

关键词：

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We study the Stochastic Multi-armed Bandit problem under bounded arm-memory. In this setting, the arms arrive in a stream, and the number of arms that can be stored in the memory at any time, is bounded. The decision-maker can only pull arms that are present in the memory. We address the problem from the perspective of two standard objectives: 1) regret minimization, and 2) best-arm identification. For regret minimization, we settle an important open question by showing an almost tight guarantee. We show Omega(T-2/3) cumulative regret in expectation for single-pass algorithms for arm-memory size of (n = 1), where n is the number of arms. For best-arm identification, we provide an (epsilon, delta)-PAC algorithm with arm-memory size of O (log* n) and O(n/epsilon(2) . log(1/delta)) optimal sample complexity.

引用

页数：13

共 35 条

[21] Karlin A, 2015, AAAI CONF ARTIF INTE, P944
[22] Bandits and Experts in Metric Spaces
Kleinberg, Robert
Slivkins, Aleksandrs
Upfal, Eli
[J]. JOURNAL OF THE ACM, 2019, 66 (04)
[23] Liau D., 2018, International Conference on Artificial Intelligence and Statistics, V84, P386
[24] Long TT, 2014, UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, P809
[25] Efficient crowdsourcing of unknown experts using bounded multi-armed bandits
Long Tran-Thanh
Stein, Sebastian
Rogers, Alex
Jennings, Nicholas R.
[J]. ARTIFICIAL INTELLIGENCE, 2014, 214 : 89 - 111
[26] Lu CJ, 2011, LECT NOTES ARTIF INT, V6925, P249, DOI 10.1007/978-3-642-24412-4_21
[27] Pekoz E. A., 2003, Journal of Applied Probability, V40, P250, DOI 10.1239/jap/1044476838
[28] Rai P, 2009, 21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, P1211
[29] SOME ASPECTS OF THE SEQUENTIAL DESIGN OF EXPERIMENTS
ROBBINS, H
[J]. BULLETIN OF THE AMERICAN MATHEMATICAL SOCIETY, 1952, 58 (05) : 527 - 535
[30] Introduction to Multi-Armed Bandits Preface
Slivkins, Aleksandrs
[J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2019, 12 (1-2): : 1 - 286

← 1 2 3 4 →