Analysis of biased stochastic gradient descent using sequential semidefinite programs

Cited by: 0
Authors
Bin Hu
Peter Seiler
Laurent Lessard
Affiliations
[1] University of Illinois at Urbana–Champaign, Department of Electrical and Computer Engineering
[2] University of Michigan, Department of Electrical Engineering and Computer Science
[3] University of Wisconsin–Madison, Department of Electrical and Computer Engineering, Wisconsin Institute for Discovery
Source
Mathematical Programming | 2021 / Vol. 187
Keywords
Biased stochastic gradient; Robustness to inexact gradient; Convergence rates; Convex optimization; First-order methods; 90C06; 90C15; 90C25; 90C30; 68Q25; 62L20
DOI
Not available
Abstract
We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points yield convergence bounds for biased SGD. Based on this LMI condition, we derive a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain analytical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.
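The sketch below illustrates the sequential-SDP idea in its simplest special case, exact gradients with no bias, where the small LMI reduces to the known 2×2 condition for gradient descent on m-strongly convex, L-smooth functions: feasibility at a given rate ρ is one SDP, and bisecting on ρ is the sequential step. It is a minimal sketch, not the paper's exact LMI (the bias and variance terms of the gradient-error model are omitted); all function names are illustrative, and cvxpy with the SCS solver is assumed.

```python
# Minimal sketch (not the paper's exact LMI): certify a linear convergence
# rate rho for gradient descent x_{k+1} = x_k - alpha * grad f(x_k) on
# m-strongly convex, L-smooth f, by checking feasibility of a 2x2 LMI,
# then bisect on rho. The paper's LMI for biased SGD carries extra terms.
import cvxpy as cp
import numpy as np

def rate_certified(rho, alpha, m, L):
    """True if V(x) = p * ||x - x*||^2 certifies contraction at rate rho."""
    p = cp.Variable(nonneg=True)    # Lyapunov weight
    lam = cp.Variable(nonneg=True)  # S-procedure multiplier
    # Sector bound on (xi, u) = (x - x*, grad f(x)):
    # 2 * (u - m*xi) . (L*xi - u) >= 0, i.e. [xi; u]' M [xi; u] >= 0.
    M = np.array([[-2.0 * m * L, m + L],
                  [m + L, -2.0]])
    # LMI: [1; -alpha] p [1, -alpha] - rho^2 * diag(p, 0) + lam * M <= 0.
    G = cp.bmat([[(1 - rho**2) * p, -alpha * p],
                 [-alpha * p, alpha**2 * p]]) + lam * M
    G = 0.5 * (G + G.T)  # symmetrize so cvxpy accepts the PSD constraint
    # The LMI is homogeneous in (p, lam); p >= 1 is a harmless normalization.
    prob = cp.Problem(cp.Minimize(0), [G << 0, p >= 1])
    prob.solve(solver=cp.SCS)
    return prob.status in ("optimal", "optimal_inaccurate")

def smallest_certified_rate(alpha, m, L, tol=1e-4):
    """Bisect on rho; each test solves one small SDP (the sequential part)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate_certified(mid, alpha, m, L):
            hi = mid
        else:
            lo = mid
    return hi

# With m = 1, L = 10, alpha = 2/(m+L), the certified rate approaches the
# classical bound (L - m)/(L + m) = 9/11.
print(smallest_certified_rate(alpha=2 / 11, m=1.0, L=10.0))
```

Sweeping alpha and re-running the bisection traces the stepsize-versus-rate trade-off; in the paper, the same loop runs over an LMI whose additional terms capture the bias and variance of the gradient errors, so each sweep also quantifies the accuracy and robustness trade-offs described in the abstract.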
Pages: 383–408 (25 pages)