Distributed Stochastic Projection-Free Algorithm for Constrained Optimization

Cited by: 2
Authors
Jiang, Xia [1 ]
Zeng, Xianlin [2 ]
Xie, Lihua [4 ]
Sun, Jian [2 ,3 ]
Chen, Jie [5 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong 999077, Peoples R China
[2] Beijing Inst Technol, Sch Automat, Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100081, Peoples R China
[3] Beijing Inst Technol Chongqing Innovat Ctr, Chongqing 401120, Peoples R China
[4] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[5] Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100081, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Optimization; Complexity theory; Convergence; Stochastic processes; Distributed algorithms; Vectors; Approximation algorithms; Minimization; Linear programming; Distributed databases; Distributed solver; finite-sum optimization; Frank-Wolfe (FW) algorithm; stochastic gradient; variance reduction; gradient methods; convex
DOI
10.1109/TAC.2024.3481040
Chinese Library Classification (CLC) Number
TP [Automation Technology and Computer Technology]
Discipline Classification Code
0812
Abstract
This article proposes a distributed stochastic projection-free algorithm for large-scale constrained finite-sum optimization in which the constraint set is complicated enough that projecting onto it is computationally expensive. The global cost function is allocated to multiple agents, each of which computes local stochastic gradients and communicates with its neighbors to solve the global problem. Stochastic gradient methods keep the per-iteration computational cost low, but the variance introduced by random sampling makes their convergence slow and difficult to guarantee. To construct a convergent distributed stochastic projection-free algorithm, this article incorporates variance-reduction and gradient-tracking techniques into the Frank-Wolfe (FW) update and develops a novel sampling rule for the variance-reduction step that suppresses the variance of the stochastic gradients. Complete and rigorous proofs show that the proposed algorithm converges at a sublinear rate and enjoys superior complexity guarantees for both convex and nonconvex objective functions. Comparative simulations demonstrate its convergence and computational efficiency.
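For intuition, the sketch below shows the basic projection-free update the abstract builds on: a minimal, single-agent mini-batch stochastic Frank-Wolfe step in which a linear minimization oracle over an l1-ball replaces any projection. Everything here is an illustrative assumption rather than the paper's algorithm: the least-squares finite-sum objective, the lmo_l1 helper, the batch size, and the classic 2/(t+2) step size are stand-ins, and the paper's method is distributed across agents and additionally layers variance reduction (via its sampling rule) and gradient tracking on top of this update.

import numpy as np

# Minimal single-agent sketch of a stochastic Frank-Wolfe (projection-free)
# update over an l1-ball. Illustrative assumptions only: the finite-sum
# least-squares objective, the helper names, and the step-size rule are
# stand-ins, not the paper's distributed variance-reduced algorithm.

def lmo_l1(grad, radius=1.0):
    # Linear minimization oracle: argmin_{||s||_1 <= radius} <grad, s>.
    # For the l1-ball the minimizer is a signed, scaled coordinate vector,
    # so the constraint is handled without any projection.
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def stochastic_fw(A, b, radius=1.0, iters=500, batch=8, seed=0):
    # Minimize f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 over ||x||_1 <= radius.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)                                      # feasible start
    for t in range(iters):
        idx = rng.choice(n, size=batch, replace=False)   # random mini-batch
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch     # stochastic gradient
        s = lmo_l1(g, radius)                            # FW direction
        gamma = 2.0 / (t + 2.0)                          # diminishing step size
        x = (1.0 - gamma) * x + gamma * s                # convex combination
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((500, 50))
    x_true = np.zeros(50)
    x_true[:3] = [0.5, -0.3, 0.2]                        # sparse ground truth
    b = A @ x_true
    x_hat = stochastic_fw(A, b)
    print("mean squared residual:", np.mean((A @ x_hat - b) ** 2))

Because each iterate is a convex combination of feasible points, feasibility is maintained throughout without a single projection, which is why FW-type updates pay off when projecting onto the constraint set is expensive; the paper's contribution is making such an update convergent and efficient in the distributed stochastic setting.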
Pages: 2479-2494
Number of pages: 16
Related Papers
56 in total
[1] Y. S. Abu-Mostafa, Learning From Data: A Short Course, 2012.
[2] A. Agarwal, in Proc. Mach. Learn. Res., vol. 37, 2015, p. 78.
[3] Z. Akhtar, in Proc. Amer. Control Conf., 2021, p. 2619, doi: 10.23919/ACC50511.2021.9483167.
[4] A. Beznosikov, in Proc. 41st Int. Conf. Mach. Learn., 2024, p. 3725.
[5] P. Bianchi, W. Hachem, and A. Salim, "A fully stochastic primal-dual algorithm," Optimization Letters, vol. 15, no. 2, pp. 701-710, 2021.
[6] P. Bianchi and J. Jakubowicz, "Convergence of a multi-agent projected stochastic gradient algorithm for non-convex optimization," IEEE Transactions on Automatic Control, vol. 58, no. 2, pp. 391-405, 2013.
[7] P. Braun, L. Gruene, C. M. Kellett, S. R. Weller, and K. Worthmann, "A distributed optimization algorithm for the predictive control of smart grids," IEEE Transactions on Automatic Control, vol. 61, no. 12, pp. 3898-3911, 2016.
[8] J. H. Chen, in Proc. AAAI Conf. Artif. Intell., vol. 34, 2020, p. 3486.
[9] Y. Chen, A. Hashemi, and H. Vikalo, "Communication-efficient variance-reduced decentralized stochastic optimization over time-varying directed graphs," IEEE Transactions on Automatic Control, vol. 67, no. 12, pp. 6583-6594, 2022.
[10] A. Cichocki, D. P. Mandic, A. H. Phan, C. F. Caiafa, G. Zhou, Q. Zhao, and L. De Lathauwer, "Tensor decompositions for signal processing applications," IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 145-163, 2015.