Distributed SGD in overparametrized linear regression

Cited by: 1
Authors
Nguyen, Mike [1]
Kirst, Charly [1]
Muecke, Nicole [1]
Affiliations
[1] Tech Univ Carolo Wilhelmina Braunschweig, D-38106 Braunschweig, Germany
Keywords
Local SGD; overparametrization; distributed learning; PREDICTION; MODELS; RATES
DOI
10.1142/S021953052350032X
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We consider distributed learning using constant-stepsize stochastic gradient descent (DSGD). The data are distributed uniformly over several devices, each of which sends a final model update to a central server. In a final step, the local estimates are aggregated. In the setting of overparametrized linear regression, we prove general upper bounds with matching lower bounds and derive learning rates for specific data-generating distributions. We show that the excess risk is of the order of the variance, provided the number of local nodes does not grow too fast with the global sample size. We further compare the sample complexity of DSGD with that of distributed ridge regression (DRR) and show that the excess DSGD risk is smaller than the excess DRR risk, while both sample complexities are of the same order.
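The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`local_sgd`, `distributed_sgd`), the single pass over each shard, the zero initialization, the use of iterate averaging on each node, and the synthetic data are all assumptions made for illustration. Only the overall structure (uniform data split, constant-stepsize SGD on each node, one final aggregation by averaging) follows the abstract.

```python
import numpy as np


def local_sgd(X, y, stepsize):
    """One pass of constant-stepsize SGD for least squares on one node.

    Returns the average of the iterates (an assumption; the abstract only
    says each node sends a final model update).
    """
    n, d = X.shape
    w = np.zeros(d)          # assumed zero initialization
    w_sum = np.zeros(d)
    for i in range(n):
        # Gradient of 0.5 * (x_i^T w - y_i)^2 with respect to w
        grad = (X[i] @ w - y[i]) * X[i]
        w = w - stepsize * grad
        w_sum += w
    return w_sum / n


def distributed_sgd(X, y, num_nodes, stepsize):
    """Split the data evenly, run local SGD on each shard, average the results."""
    shards_X = np.array_split(X, num_nodes)
    shards_y = np.array_split(y, num_nodes)
    local_estimates = [
        local_sgd(X_m, y_m, stepsize) for X_m, y_m in zip(shards_X, shards_y)
    ]
    # Single aggregation step on the central server: uniform average
    return np.mean(local_estimates, axis=0)


# Hypothetical overparametrized example: more features (d) than samples (n),
# with a polynomially decaying covariance spectrum chosen for illustration.
rng = np.random.default_rng(0)
n, d = 200, 500
eigenvalues = 1.0 / np.arange(1, d + 1) ** 2
X_demo = rng.standard_normal((n, d)) * np.sqrt(eigenvalues)
w_star = np.ones(d)
y_demo = X_demo @ w_star + 0.1 * rng.standard_normal(n)

w_hat = distributed_sgd(X_demo, y_demo, num_nodes=4, stepsize=0.1)
excess_risk = np.sum(eigenvalues * (w_hat - w_star) ** 2)
print("excess risk estimate:", excess_risk)
```

Calling `distributed_sgd` with `num_nodes=1` reduces to single-machine SGD, which gives a simple baseline for observing how the number of nodes affects the excess risk at a fixed global sample size.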
Pages: 425-466
Number of pages: 42