Distributed Stochastic Optimization Under a General Variance Condition

Cited by: 1
Authors
Huang, Kun [1]
Li, Xiao [1]
Pu, Shi [1]
Affiliations
[1] School of Data Science, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, China
Funding
National Natural Science Foundation of China
Keywords
Optimization; Linear programming; Distributed databases; Gradient methods; Convergence; Complexity theory; Particle measurements; Distributed optimization; Nonconvex optimization; Stochastic optimization; Learning behavior
DOI
10.1109/TAC.2024.3393169
Chinese Library Classification (CLC)
TP [Automation technology; Computer technology]
Discipline Classification Code
0812
Abstract
Distributed stochastic optimization has recently drawn great attention due to its effectiveness in solving large-scale machine learning problems. Although numerous algorithms have been proposed and successfully applied to practical problems, their theoretical guarantees mainly rely on certain boundedness conditions on the stochastic gradients, ranging from uniform boundedness to the relaxed growth condition. In addition, how to characterize the data heterogeneity among the agents and its impact on algorithmic performance remains challenging. Motivated by these issues, we revisit the classical federated averaging (FedAvg) algorithm (McMahan et al., 2017) as well as the more recent SCAFFOLD method (Karimireddy et al., 2020) for solving the distributed stochastic optimization problem, and we establish convergence results under only a mild variance condition on the stochastic gradients for smooth nonconvex objective functions. Almost sure convergence to a stationary point is also established under the same condition. Moreover, we discuss a more informative measure of data heterogeneity, as well as its implications.
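As background for the abstract above, the following is a minimal Python sketch of federated averaging (FedAvg; McMahan et al., 2017), the baseline algorithm the paper revisits: each agent runs a few local stochastic gradient steps from the current global model, and the server averages the resulting local models. The synthetic least-squares objectives, agent count, local step count `local_steps`, and step size `lr` are illustrative assumptions, not the setting or the convergence conditions analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, n_local = 10, 5, 100

# Synthetic heterogeneous local problems (illustrative only):
# agent i holds data (A_i, b_i) and local loss f_i(x) = ||A_i x - b_i||^2 / (2 * n_local).
data = [(rng.standard_normal((n_local, dim)),
         rng.standard_normal(n_local) + i) for i in range(n_agents)]

def stoch_grad(x, A, b, batch=8):
    """Mini-batch stochastic gradient of the local least-squares loss."""
    idx = rng.integers(0, A.shape[0], size=batch)
    A_b, b_b = A[idx], b[idx]
    return A_b.T @ (A_b @ x - b_b) / batch

def fedavg(rounds=50, local_steps=10, lr=0.01):
    x = np.zeros(dim)                          # global model
    for _ in range(rounds):
        local_models = []
        for A, b in data:                      # every agent participates each round
            x_i = x.copy()                     # start from the current global model
            for _ in range(local_steps):       # local SGD steps
                x_i -= lr * stoch_grad(x_i, A, b)
            local_models.append(x_i)
        x = np.mean(local_models, axis=0)      # server averages the local models
    return x

x_final = fedavg()
# Stationarity check: norm of the full gradient of f(x) = (1/n) * sum_i f_i(x).
full_grad = np.mean([A.T @ (A @ x_final - b) / n_local for A, b in data], axis=0)
print("||grad f(x_final)|| =", np.linalg.norm(full_grad))
```

SCAFFOLD (Karimireddy et al., 2020), also studied in the paper, additionally maintains control variates that correct the local updates for client drift; that correction is omitted from this sketch.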
Pages: 6105-6120
Page count: 16