On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression

Cited: 0
Authors
Cohen, Kobi [1 ]
Nedic, Angelia [2 ]
Srikant, R. [2 ]
Affiliations
[1] Ben Gurion Univ Negev, Dept Elect & Comp Engn, POB 653, IL-84105 Beer Sheva, Israel
[2] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Source
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016
Keywords
Convex optimization; projected stochastic gradient descent; weighted averaging; empirical risk minimizer; approximation
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
The problem of least squares regression of a d-dimensional unknown parameter is considered. A stochastic gradient descent based algorithm with weighted iterate-averaging that uses a single pass over the data is studied, and its convergence rate is analyzed. We first consider a bounded constraint set for the unknown parameter. Under some standard regularity assumptions, we provide an explicit O(1/k) upper bound on the convergence rate, depending on the variance (due to the additive noise in the measurements) and the size of the constraint set. We show that the variance term dominates the error and decreases at rate 1/k, while the constraint-set term decreases at rate log k/k². We then study the asymptotic ratio ρ between the convergence rate of the proposed scheme and that of the empirical risk minimizer (ERM) as the number of iterations approaches infinity. We show that ρ ≤ 4 under some mild conditions for all d ≥ 1. We further improve this upper bound by showing that ρ ≤ 4/3 for the case of d = 1 and an unbounded parameter set. Simulation results demonstrate strong performance of the algorithm compared to existing methods, and coincide with ρ ≤ 4/3 even for large d in practice.
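
For concreteness, below is a minimal one-pass sketch of projected SGD with weighted iterate-averaging for least squares, in the spirit of the abstract. The ℓ2-ball constraint set, the step size 1/(lam·k), and the linearly growing averaging weights w_k = k are illustrative assumptions of this sketch; the record does not specify the paper's exact choices, and the function and variable names here are hypothetical.

import numpy as np

def projected_sgd_weighted_avg(X, y, radius, lam=1.0):
    """One-pass projected SGD for least squares with weighted iterate-averaging.

    Illustrative assumptions (not specified in this record): the constraint
    set is the l2-ball of the given radius, the step size is 1/(lam * k),
    and the averaging weight of iterate k is w_k = k.
    """
    n, d = X.shape
    theta = np.zeros(d)        # current iterate
    theta_bar = np.zeros(d)    # weighted running average of the iterates
    weight_sum = 0.0
    for k in range(1, n + 1):
        x_k, y_k = X[k - 1], y[k - 1]
        grad = (x_k @ theta - y_k) * x_k   # gradient of 0.5 * (x'theta - y)^2
        theta = theta - grad / (lam * k)   # SGD step with decaying step size
        norm = np.linalg.norm(theta)       # Euclidean projection onto the ball
        if norm > radius:
            theta *= radius / norm
        w = float(k)                       # averaging weight w_k = k
        weight_sum += w
        theta_bar += (w / weight_sum) * (theta - theta_bar)  # running weighted mean
    return theta_bar

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n, sigma = 5, 10_000, 0.5
    theta_star = rng.normal(size=d)
    theta_star /= np.linalg.norm(theta_star)   # keep the truth inside the unit ball
    X = rng.normal(size=(n, d))
    y = X @ theta_star + sigma * rng.normal(size=n)

    theta_sgd = projected_sgd_weighted_avg(X, y, radius=1.0)
    theta_erm = np.linalg.lstsq(X, y, rcond=None)[0]   # empirical risk minimizer

    print("SGD-avg squared error:", np.linalg.norm(theta_sgd - theta_star) ** 2)
    print("ERM squared error:    ", np.linalg.norm(theta_erm - theta_star) ** 2)

Comparing the two printed errors over repeated runs gives a rough empirical estimate of the ratio ρ between the averaged scheme's error and the ERM's, the quantity bounded in the abstract.
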
Pages: 2314-2318
Page count: 5
Related papers
25 items in total (10 shown below)
  • [1] On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression
    Cohen, Kobi
    Nedic, Angelia
    Srikant, R.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (11) : 5974 - 5981
  • [2] Parallelizing Stochastic Gradient Descent for Least Squares Regression: Mini-batching, Averaging, and Model Misspecification
    Jain, Prateek
    Netrapalli, Praneeth
    Kakade, Sham M.
    Kidambi, Rahul
    Sidford, Aaron
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 18
  • [3] Optimized convergence of stochastic gradient descent by weighted averaging
    Hagedorn, Melinda
    Jarre, Florian
    OPTIMIZATION METHODS & SOFTWARE, 2024, 39 (04) : 699 - 724
  • [4] ON STOCHASTIC SUBGRADIENT MIRROR-DESCENT ALGORITHM WITH WEIGHTED AVERAGING
    Nedic, Angelia
    Lee, Soomin
    SIAM JOURNAL ON OPTIMIZATION, 2014, 24 (01) : 84 - 107
  • [5] Constrained Stochastic Gradient Descent for Large-scale Least Squares Problem
    Mu, Yang
    Ding, Wei
    Zhou, Tianyi
    Tao, Dacheng
    19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 883 - 891
  • [6] Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
    Needell, Deanna
    Srebro, Nathan
    Ward, Rachel
    MATHEMATICAL PROGRAMMING, 2016, 155 (1-2) : 549 - 573
  • [7] A robust weighted least squares support vector regression based on least trimmed squares
    Chen, Chuanfa
    Yan, Changqing
    Li, Yanyan
    NEUROCOMPUTING, 2015, 168 : 941 - 946
  • [8] Stochastic Gradient Descent Meets Distribution Regression
    Muecke, Nicole
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [9] The Implicit Regularization of Stochastic Gradient Flow for Least Squares
    Ali, Alnur
    Dobriban, Edgar
    Tibshirani, Ryan J.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [10] Central limit theorems for stochastic gradient descent with averaging for stable manifolds
    Dereich, Steffen
    Kassing, Sebastian
    ELECTRONIC JOURNAL OF PROBABILITY, 2023, 28