ON THE CONVERGENCE OF DECENTRALIZED GRADIENT DESCENT

被引：401

作者：

Yuan, Kun ^{[1
]}

Ling, Qing ^{[1
]}

Yin, Wotao ^{[2
]}

机构：

[1] Univ Sci & Technol China, Dept Automat, Hefei 230026, Anhui, Peoples R China

[2] Univ Calif Los Angeles, Dept Math, Los Angeles, CA 90095 USA

来源：

SIAM JOURNAL ON OPTIMIZATION | 2016年 / 26卷 / 03期

基金：

美国国家科学基金会;

关键词：

decentralized; distributed; consensus; optimization; gradient descent; OPTIMIZATION; CONSENSUS; ALGORITHMS; NETWORKS;

D O I：

10.1137/130943170

中图分类号：

O29 [应用数学];

学科分类号：

070104 ;

摘要：

Consider the consensus problem of minimizing f(x) = Sigma(n)(i=1) fi(x), where x is an element of R-p and each f(i) is only known to the individual agent i in a connected network of n agents. To solve this problem and obtain the solution, all the agents collaborate with their neighbors through information exchange. This type of decentralized computation does not need a fusion center, offers better network load balance, and improves data privacy. This paper studies the decentralized gradient descent method [A. Nedic and A. Ozdaglar, IEEE Trans. Automat. Control, 54 (2009), pp. 48-61], in which each agent i updates its local variable x((i)) is an element of R-n by combining the average of its neighbors' with a local negative-gradient step -alpha del f(i)(x((i))). The method is described by the iteration x((i)) (k + 1) <- Sigma(n)(j=1) w(ij)x((j))(k)-alpha del f(i)(x((i))(k)), for each agent i, where w(ij) is nonzero only if i and j are neighbors or i = j and the matrix W = [w(ij)] is an element of R-n (x) (n) is symmetric and doubly stochastic. This paper analyzes the convergence of this iteration and derives its rate of convergence under the assumption that each f(i) is proper closed convex and lower bounded, del f(i) is Lipschitz continuous with constant L-fi > 0, and the stepsize a is fixed. Provided that alpha <= min{(1 + lambda(n)) (W))/L-h, 1/L-(f) over bar}, where L-h = max(i){L-fi} and L-(f) over bar = (1)(n) Sigma(n)(i=1) L-fi, the objective errors of all the local solutions and the networkwide mean solution reduce at rates of O(1/k) until they reach a level of O(alpha). If fi are strongly convex with modulus mu(fi), and alpha <= min{(1+lambda(n)(W))/L-h, 1/(L-f+mu(f))}, where mu(f) - 1/n Sigma(n)(i=1) mu(fi), then all the local solutions and the mean solution converge to the global minimizer x* at a linear rate until reaching an O(alpha)-neighborhood of x*. We also develop an iteration for decentralized basis pursuit and establish its linear convergence to an O(alpha)-neighborhood of the true sparse signal. This analysis reveals how the convergence of x((i))(k + 1) <- Sigma(n)(j=1) wijx(j)(k) - alpha del f(i)(x((i))(k)), for each agent i, depends on the stepsize, function convexity, and network spectrum.

引用

页码：1835 / 1854

页数：20

共 39 条

[1] [Anonymous], 2011, CONVEX ANAL MONOTONE, DOI [10.1007/978-1-4419-9467-7, DOI 10.1007/978-1-4419-9467-7]
[2] [Anonymous], 1984, Technical report
[3] Group-Lasso on Splines for Spectrum Cartography
Bazerque, Juan Andres
Mateos, Gonzalo
Giannakis, Georgios B.
[J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2011, 59 (10) : 4648 - 4663
[4] Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity
Bazerque, Juan Andres
Giannakis, Georgios B.
[J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2010, 58 (03) : 1847 - 1862
[5] Boyd S, 2004, SIAM REV, V46, P667, DOI [10.1137/S0036144503423264, 10.1137/s0036144503423264]
[6] An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination
Cao, Yongcan
Yu, Wenwu
Ren, Wei
Chen, Guanrong
[J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2013, 9 (01) : 427 - 438
[7] A Distributed Subgradient Method for Dynamic Convex Optimization Problems Under Noisy Information Exchange
Cavalcante, Renato L. G.
Stanczak, Slawomir
[J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2013, 7 (02) : 243 - 256
[8] Diffusion Adaptation Strategies for Distributed Optimization and Learning Over Networks
Chen, Jianshu
Sayed, Ali H.
[J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2012, 60 (08) : 4289 - 4305
[9] Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling
Duchi, John C.
Agarwal, Alekh
Wainwright, Martin J.
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2012, 57 (03) : 592 - 606
[10] EXACT REGULARIZATION OF CONVEX PROGRAMS
Friedlander, Michael P.
Tseng, Paul
[J]. SIAM JOURNAL ON OPTIMIZATION, 2008, 18 (04) : 1326 - 1350

← 1 2 3 4 →