3 references in total
[1]
Auer P, 2003, SIAM J COMPUT, V32, P48, DOI 10.1137/S0097539701398375
[2]
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems [J]. Foundations and Trends in Machine Learning, 2012, 5(1): 1-122
[3]
Cesa-Bianchi N., 2006, Prediction, learning, and games, DOI 10.1017/CBO9780511546921