Adaptive-treed bandits

被引:12
作者
Bull, Adam D. [1 ]
机构
[1] Stat Lab, Cambridge CB3 0WB, England
基金
英国工程与自然科学研究理事会;
关键词
bandits on taxonomies; continuum-armed bandits; noisy global optimisation; tree-armed bandits; zooming dimension; MULTIARMED BANDITS;
D O I
10.3150/14-BEJ644
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.
引用
收藏
页码:2289 / 2307
页数:19
相关论文
共 23 条
  • [1] THE CONTINUUM-ARMED BANDIT PROBLEM
    AGRAWAL, R
    [J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1995, 33 (06) : 1926 - 1951
  • [2] Finite-time analysis of the multiarmed bandit problem
    Auer, P
    Cesa-Bianchi, N
    Fischer, P
    [J]. MACHINE LEARNING, 2002, 47 (2-3) : 235 - 256
  • [3] Improved rates for the stochastic continuum-armed bandit problem
    Auer, Peter
    Ortner, Ronald
    Szepesvari, Csaba
    [J]. LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 454 - +
  • [4] Bubeck S., 2010, THESIS U LILLE 1, P1
  • [5] Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
    Bubeck, Sebastien
    Cesa-Bianchi, Nicolo
    [J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2012, 5 (01): : 1 - 122
  • [6] Bubeck S, 2011, LECT NOTES ARTIF INT, V6925, P144, DOI 10.1007/978-3-642-24412-4_14
  • [7] Bubeck S, 2011, J MACH LEARN RES, V12, P1655
  • [8] Bubeck S, 2009, LECT NOTES ARTIF INT, V5809, P23, DOI 10.1007/978-3-642-04414-4_7
  • [9] Bull A.D., 2014, ADAPTIVE TREED BAN S, DOI [10.3150/14-BEJ644SUPP, DOI 10.3150/14-BEJ644SUPP]
  • [10] Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
    Cope, Eric W.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2009, 54 (06) : 1243 - 1253