Distributed Stochastic Gradient Tracking Algorithm With Variance Reduction for Non-Convex Optimization

Cited by: 10
Authors
Jiang, Xia [1 ,2 ]
Zeng, Xianlin [1 ]
Sun, Jian [1 ,2 ]
Chen, Jie [1 ,3 ]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Key Lab Intelligent Control & Decis Complex Syst, Beijing 100081, Peoples R China
[2] Beijing Inst Technol Chongqing Innovat Ctr, Chongqing 401120, Peoples R China
[3] Tongji Univ, Sch Elect & Informat Engn, Shanghai 200082, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Signal processing algorithms; Optimization; Convergence; Random variables; Distributed algorithms; Machine learning; Distributed databases; Complexity analysis; distributed algorithm; non-convex finite-sum optimization; stochastic gradient; variance reduction; TIME;
DOI
10.1109/TNNLS.2022.3170944
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article proposes a distributed stochastic algorithm with variance reduction for general smooth non-convex finite-sum optimization, which has wide applications in the signal processing and machine learning communities. In the distributed setting, a large number of samples are allocated to multiple agents in the network. Each agent computes local stochastic gradients and communicates with its neighbors to seek the global optimum. In this article, we develop a modified variance reduction technique to deal with the variance introduced by stochastic gradients. Combining gradient tracking and variance reduction techniques, this article proposes a distributed stochastic algorithm, the gradient tracking algorithm with variance reduction (GT-VR), to solve large-scale non-convex finite-sum optimization over multiagent networks. A complete and rigorous proof shows that the GT-VR algorithm converges to first-order stationary points at an O(1/k) convergence rate. In addition, we provide a complexity analysis of the proposed algorithm. Compared with some existing first-order methods, the proposed algorithm has a lower O(PMε⁻¹) gradient complexity under mild conditions. By comparing GT-VR with state-of-the-art algorithms in numerical simulations, we verify the efficiency of the proposed algorithm.
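To make the combination of gradient tracking and variance reduction described in the abstract concrete, the following is a minimal illustrative sketch, not the paper's exact GT-VR update rules: it assumes an SVRG-style variance-reduced local gradient estimator, synthetic quadratic local losses, a ring-topology doubly stochastic mixing matrix W, and arbitrarily chosen step size and epoch schedule. All names, parameters, and data below are assumptions made only to show the structure of a gradient-tracking update driven by variance-reduced stochastic gradients.

# Illustrative sketch only (assumptions: SVRG-style estimator, ring topology,
# quadratic losses, arbitrary step size). Not the paper's exact GT-VR method.
import numpy as np

rng = np.random.default_rng(0)

# Problem setup: P agents, each holding M local losses f_{i,j}(x) = 0.5*(a^T x - b)^2.
P, M, d = 4, 50, 5
A = rng.normal(size=(P, M, d))
b = rng.normal(size=(P, M))

def local_grad(i, j, x):
    # Gradient of the j-th local loss of agent i.
    a = A[i, j]
    return (a @ x - b[i, j]) * a

def full_local_grad(i, x):
    # Full local gradient (average over agent i's M samples).
    return np.mean([local_grad(i, j, x) for j in range(M)], axis=0)

# Doubly stochastic mixing matrix for a ring of P agents (assumed topology).
W = np.zeros((P, P))
for i in range(P):
    W[i, i] = 0.5
    W[i, (i - 1) % P] = 0.25
    W[i, (i + 1) % P] = 0.25

alpha, epochs, inner = 0.05, 30, M              # assumed step size and schedule
x = np.zeros((P, d))                            # local iterates, one row per agent
y = np.array([full_local_grad(i, x[i]) for i in range(P)])  # gradient trackers
g_prev = y.copy()

for s in range(epochs):
    # Snapshot point and full local gradients (SVRG-style reference gradients).
    x_snap = x.copy()
    mu = np.array([full_local_grad(i, x_snap[i]) for i in range(P)])
    for t in range(inner):
        g = np.empty_like(x)
        for i in range(P):
            j = rng.integers(M)
            # Variance-reduced stochastic gradient at agent i.
            g[i] = local_grad(i, j, x[i]) - local_grad(i, j, x_snap[i]) + mu[i]
        # Gradient tracking: mix trackers and add the change in local estimates.
        y = W @ y + g - g_prev
        # Consensus step plus descent along the tracked gradient direction.
        x = W @ x - alpha * y
        g_prev = g

# Rough stationarity proxy: norm of the average full gradient at the local iterates.
print("stationarity proxy:", np.linalg.norm(
    np.mean([full_local_grad(i, x[i]) for i in range(P)], axis=0)))

Running the sketch shows the tracked gradients and local iterates reaching approximate consensus while the stationarity proxy decreases; the variance-reduced estimator replaces the plain stochastic gradient that a standard gradient-tracking scheme would feed into the same update.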
Pages: 5310-5321
Number of pages: 12