Noise Free Multi-armed Bandit Game

Cited by: 1
Authors
Nakamura, Atsuyoshi [1]
Helmbold, David P. [2]
Warmuth, Manfred K. [2]
Affiliations
[1] Hokkaido Univ, Kita Ku, Kita 14,Nishi 9, Sapporo, Hokkaido 0600814, Japan
[2] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
Source
LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS, LATA 2016 | 2016 / Vol. 9618
Funding
U.S. National Science Foundation
Keywords
Algorithmic learning; Online learning; Bandit problem
DOI
10.1007/978-3-319-30000-9_32
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We study the loss version of adversarial multi-armed bandit problems with one lossless arm. We show an adversary's strategy that forces any player to suffer a loss of K - 1 - O(1/T), where K is the number of arms and T is the number of rounds.
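
To make the game in the abstract concrete, the following is a minimal Python sketch of the protocol. It is an illustration under stated assumptions, not the adversary strategy from the paper: losses are assumed to be binary (0 or 1), the hypothetical adversary simply assigns loss 1 to every arm except one fixed lossless arm in every round, and the hypothetical player scans the arms in order until a pull returns loss 0. The function name `play` and its parameters are made up for this example.

```python
def play(num_arms, num_rounds, lossless_arm):
    """Total loss of a player who scans the arms from left to right.

    Assumed adversary (hypothetical, not the paper's): every arm except
    `lossless_arm` has loss 1 in every round.
    Assumed player (hypothetical): pull the next untried arm until a pull
    returns loss 0, then keep pulling that arm.
    """
    total_loss = 0
    arm = 0
    for _ in range(num_rounds):
        # Bandit feedback: the player observes only the pulled arm's loss.
        loss = 0 if arm == lossless_arm else 1
        total_loss += loss
        if loss == 1 and arm < num_arms - 1:
            arm += 1  # this arm is not the lossless one, so try the next
    return total_loss


K, T = 5, 100
losses = [play(K, T, z) for z in range(K)]
print(losses)  # [0, 1, 2, 3, 4]
```

Against this particular deterministic player, placing the lossless arm last costs K - 1, which matches the leading term of the bound. The sketch says nothing about randomized players, and the strategy that achieves K - 1 - O(1/T) against any player is the one given in the paper, not this one.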
Pages: 412-423
Page count: 12
Related papers
3 in total
[1] Auer P, 2003, SIAM J COMPUT, V32, P48, DOI 10.1137/S0097539701398375
[2] Bubeck, Sebastien; Cesa-Bianchi, Nicolo. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems [J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2012, 5 (01): 1-122
[3] Cesa-Bianchi N., 2006, Prediction, learning, and games, DOI 10.1017/CBO9780511546921