Adaptive-treed bandits

被引：12

作者：

Bull, Adam D. ^{[1
]}

机构：

[1] Stat Lab, Cambridge CB3 0WB, England

来源：

BERNOULLI | 2015年 / 21卷 / 04期

基金：

英国工程与自然科学研究理事会;

关键词：

bandits on taxonomies; continuum-armed bandits; noisy global optimisation; tree-armed bandits; zooming dimension; MULTIARMED BANDITS;

D O I：

10.3150/14-BEJ644

中图分类号：

O21 [概率论与数理统计]; C8 [统计学];

学科分类号：

020208 ; 070103 ; 0714 ;

摘要：

We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.

引用

页码：2289 / 2307

页数：19

共 23 条

[1] THE CONTINUUM-ARMED BANDIT PROBLEM
AGRAWAL, R
[J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1995, 33 (06) : 1926 - 1951
[2] Finite-time analysis of the multiarmed bandit problem
Auer, P
Cesa-Bianchi, N
Fischer, P
[J]. MACHINE LEARNING, 2002, 47 (2-3) : 235 - 256
[3] Improved rates for the stochastic continuum-armed bandit problem
Auer, Peter
Ortner, Ronald
Szepesvari, Csaba
[J]. LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 454 - +
[4] Bubeck S., 2010, THESIS U LILLE 1, P1
[5] Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
Bubeck, Sebastien
Cesa-Bianchi, Nicolo
[J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2012, 5 (01): : 1 - 122
[6] Bubeck S, 2011, LECT NOTES ARTIF INT, V6925, P144, DOI 10.1007/978-3-642-24412-4_14
[7] Bubeck S, 2011, J MACH LEARN RES, V12, P1655
[8] Bubeck S, 2009, LECT NOTES ARTIF INT, V5809, P23, DOI 10.1007/978-3-642-04414-4_7
[9] Bull A.D., 2014, ADAPTIVE TREED BAN S, DOI [10.3150/14-BEJ644SUPP, DOI 10.3150/14-BEJ644SUPP]
[10] Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
Cope, Eric W.
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2009, 54 (06) : 1243 - 1253

← 1 2 3 →