STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK

被引：39

作者：

Agarwal, Alekh ^{[1
]}

Foster, Dean P. ^{[2
]}

Hsu, Daniel ^{[3
]}

Kakade, Sham M. ^{[3
]}

Rakhlin, Alexander ^{[2
]}

机构：

[1] Microsoft Res, New York, NY 10016 USA

[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA

[3] Microsoft Res, Cambridge, MA 02142 USA

来源：

SIAM JOURNAL ON OPTIMIZATION | 2013年 / 23卷 / 01期

基金：

美国国家科学基金会;

关键词：

derivative-free optimization; bandit optimization; ellipsoid method;

D O I：

10.1137/110850827

中图分类号：

O29 [应用数学];

学科分类号：

070104 ;

摘要：

This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x is an element of X. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs (O) over tilde (poly(d)root T) regret. Since any algorithm has regret at least Omega(root T) on this problem, our algorithm is optimal in terms of the scaling with T.

引用

页码：213 / 240

页数：28

共 20 条

[1] AGARWAL A., 2010, P COLT
[2] THE CONTINUUM-ARMED BANDIT PROBLEM
AGRAWAL, R
[J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1995, 33 (06) : 1926 - 1951
[3] Improved rates for the stochastic continuum-armed bandit problem
Auer, Peter
Ortner, Ronald
Szepesvari, Csaba
[J]. LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 454 - +
[4] Solving convex programs by random walks
Bertsimas, D
Vempala, S
[J]. JOURNAL OF THE ACM, 2004, 51 (04) : 540 - 556
[5] Bubeck S, 2011, J MACH LEARN RES, V12, P1655
[6] Buldygin V. V., 1980, Ukr. Math. J, V32, P483, DOI [DOI 10.1007/BF01087176, 10.1007/BF01087176]
[7] Conn A. R., 2009, Introduction to Derivative-Free Optimization
[8] Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
Cope, Eric W.
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2009, 54 (06) : 1243 - 1253
[9] DANI V., 2008, P 21 ANN C LEARN THE
[10] Flaxman AD, 2005, PROCEEDINGS OF THE SIXTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P385

← 1 2 →