Smooth Online Learning is as Easy as Statistical Learning

Cited by: 0
Authors
Block, Adam [1 ]
Dagan, Yuval [1 ]
Golowich, Noah [1 ]
Rakhlin, Alexander [1 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
Source
CONFERENCE ON LEARNING THEORY, VOL 178 | 2022 / Vol. 178
Funding
National Science Foundation (US);
Keywords
Online Learning; Smoothed Analysis; Oracle Complexity; SEQUENTIAL COMPLEXITIES; ALGORITHMS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Much of modern learning theory has been split between two regimes: the classical offline setting, where data arrive independently, and the online setting, where data arrive adversarially. While the former model is often both computationally and statistically tractable, the latter requires no distributional assumptions. In an attempt to achieve the best of both worlds, previous work proposed the smooth online setting where each sample is drawn from an adversarially chosen distribution, which is smooth, i.e., it has a bounded density with respect to a fixed dominating measure. Existing results for the smooth setting were known only for binary-valued function classes and were computationally expensive in general; in this paper, we fill these lacunae. In particular, we provide tight bounds on the minimax regret of learning a nonparametric function class, with nearly optimal dependence on both the horizon and smoothness parameters. Furthermore, we provide the first oracle-efficient, no-regret algorithms in this setting. In particular, we propose an oracle-efficient improper algorithm whose regret achieves optimal dependence on the horizon and a proper algorithm requiring only a single oracle call per round whose regret has the optimal horizon dependence in the classification setting and is sublinear in general. Both algorithms have exponentially worse dependence on the smoothness parameter of the adversary than the minimax rate. We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning. Finally, we apply our results to the contextual bandit setting to show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits in the case that contexts arrive in a smooth manner.
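The setting the abstract describes can be made concrete with a toy sketch. A distribution on [0,1] is σ-smooth when its density is at most 1/σ with respect to the uniform measure; below, the adversary draws each point uniformly from an adversarially placed interval of length σ, which is exactly σ-smooth. The learner is a proper algorithm making a single ERM-oracle call per round (follow-the-leader over 1-D threshold classifiers). This is only an illustrative simulation under these stated assumptions, not the paper's algorithm; `erm_threshold`, `run`, the threshold class, and the realizable labeling rule are all choices made here for the example.

```python
import random

def erm_threshold(data):
    """ERM oracle for 1-D thresholds h_t(x) = 1[x >= t]:
    returns a threshold minimizing empirical 0-1 loss on `data`."""
    if not data:
        return 0.0
    candidates = [0.0] + sorted(x for x, _ in data)
    best_t, best_err = 0.0, float("inf")
    for t in candidates:
        err = sum(((x >= t) != y) for x, y in data)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def run(T=300, sigma=0.1, seed=0):
    """Simulate T rounds against a sigma-smooth adversary.

    Each round the adversary picks an interval [a, a + sigma] and draws x
    uniformly from it (density <= 1/sigma w.r.t. Uniform[0,1], i.e.
    sigma-smooth). Labels come from a fixed true threshold, so the
    stream is realizable. The learner makes one oracle call per round
    (follow-the-leader) and we count its cumulative mistakes."""
    rng = random.Random(seed)
    true_t = 0.5
    data, mistakes = [], 0
    for _ in range(T):
        a = rng.uniform(0.0, 1.0 - sigma)
        x = rng.uniform(a, a + sigma)
        y = x >= true_t
        t_hat = erm_threshold(data)  # single ERM-oracle call this round
        mistakes += (x >= t_hat) != y
        data.append((x, y))
    return mistakes
```

Because the stream is realizable and the class has low complexity, follow-the-leader makes few mistakes here; the smoothness assumption is what rules out the worst-case adversarial sequences that would defeat such oracle-based learners in the fully online model.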
Pages: 71