On Biased Stochastic Gradient Estimation

Cited by: 0
Authors
Driggs, Derek [1 ]
Liang, Jingwei [2 ,3 ]
Schonlieb, Carola-Bibiane [1 ]
Affiliations
[1] Univ Cambridge, Dept Appl Math & Theoret Phys, Cambridge CB3 0WA, England
[2] Shanghai Jiao Tong Univ, Inst Nat Sci, Shanghai 200240, Peoples R China
[3] Shanghai Jiao Tong Univ, Sch Math Sci, Shanghai 200240, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); European Union Horizon 2020;
Keywords
stochastic gradient descent; variance reduction; biased gradient estimation; OPTIMIZATION; ALGORITHM;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
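The recursive estimator family the abstract refers to (SARAH, on which SARGE builds) can be illustrated concretely. Below is a minimal NumPy sketch of the SARAH recursion v_k = grad f_{i_k}(x_k) - grad f_{i_k}(x_{k-1}) + v_{k-1}; conditioned on the past, the expectation of v_k differs from the full gradient, which is what makes it a biased estimator. The function name, step size, and loop lengths are illustrative assumptions, and this is SARAH, not the paper's SARGE algorithm.

```python
import numpy as np

def sarah(grad_i, x0, n, n_epochs=10, inner_len=None, step=0.01, rng=None):
    """Minimal sketch of SARAH's biased recursive gradient estimator for
    a finite-sum objective f(x) = (1/n) * sum_i f_i(x).

    grad_i(x, i) returns the gradient of the i-th component f_i at x.
    All names and defaults are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    inner_len = n if inner_len is None else inner_len
    x = np.asarray(x0, dtype=float)
    for _ in range(n_epochs):
        x_prev = x
        # Anchor each outer loop with one full-gradient evaluation.
        v = np.mean([grad_i(x_prev, i) for i in range(n)], axis=0)
        x = x_prev - step * v
        for _ in range(inner_len):
            i = rng.integers(n)
            # Recursive update: conditioned on the past, E[v] != grad f(x),
            # so v is a biased estimate (unlike SVRG's unbiased correction).
            v = grad_i(x, i) - grad_i(x_prev, i) + v
            x_prev, x = x, x - step * v
    return x

# Toy usage on least squares, f_i(x) = 0.5 * (a_i @ x - b_i) ** 2.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
x_hat = sarah(lambda x, i: (A[i] @ x - b[i]) * A[i], np.zeros(5), n=100, step=0.05)
```

Each inner iteration above makes two stochastic-oracle calls; per the abstract, the SARGE estimator is designed to need strictly fewer calls per iteration than SVRG and SARAH, though its exact update is not reproduced here.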
Pages: 43
Related Papers
50 records in total
  • [31] Adaptive Stochastic Gradient Descent (SGD) for erratic datasets
    Dagal, Idriss
    Tanrioven, Kursat
    Nayir, Ahmet
    Akin, Burak
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2025, 166
  • [32] Nested Distributed Gradient Methods with Stochastic Computation Errors
    Iakovidou, Charikleia
    Wei, Ermin
    2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 339 - 346
  • [33] Stochastic Gradient Descent with Polyak's Learning Rate
    Prazeres, Mariana
    Oberman, Adam M.
    JOURNAL OF SCIENTIFIC COMPUTING, 2021, 89
  • [34] Distributed and asynchronous Stochastic Gradient Descent with variance reduction
    Ming, Yuewei
    Zhao, Yawei
    Wu, Chengkun
    Li, Kuan
    Yin, Jianping
    NEUROCOMPUTING, 2018, 281 : 27 - 36
  • [35] On Almost Sure Convergence Rates of Stochastic Gradient Methods
    Liu, Jun
    Yuan, Ye
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022
  • [36] Adjusted stochastic gradient descent for latent factor analysis
    Li, Qing
    Xiong, Diwen
    Shang, Mingsheng
    INFORMATION SCIENCES, 2022, 588 : 196 - 213
  • [37] Stochastic Gradient Descent with Polyak's Learning Rate
    Prazeres, Mariana
    Oberman, Adam M.
    JOURNAL OF SCIENTIFIC COMPUTING, 2021, 89 (01)
  • [38] Stochastic gradient descent for semilinear elliptic equations with uncertainties
    Wang, Ting
    Knap, Jaroslaw
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 426
  • [39] Katyusha: The First Direct Acceleration of Stochastic Gradient Methods
    Allen-Zhu, Zeyuan
    STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2017, : 1200 - 1205
  • [40] SAAGs: Biased stochastic variance reduction methods for large-scale learning
    Chauhan, Vinod Kumar
    Sharma, Anuj
    Dahiya, Kalpana
    APPLIED INTELLIGENCE, 2019, 49 : 3331 - 3361