Sample-Based and Feature-Based Federated Learning for Unconstrained and Constrained Nonconvex Optimization via Mini-batch SSCA

Cited by: 4
Authors
Cui, Ying [1 ]
Li, Yangchen [1 ]
Ye, Chencheng [1 ]
Affiliations
[1] Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, Shanghai 200240, People's Republic of China
Funding
Natural Science Foundation of Shanghai; National Key R&D Program of China
Keywords
Optimization; Signal processing algorithms; Stochastic processes; Privacy; Approximation algorithms; Servers; Computational modeling; Federated learning; nonconvex optimization; stochastic optimization; stochastic successive convex approximation; GRADIENT DESCENT;
DOI
10.1109/TSP.2022.3185895
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809
Abstract
Federated learning (FL) enables the collaborative training of machine learning models among multiple clients that hold sensitive local data, and has become an active research area. Nevertheless, unconstrained federated optimization has been studied mainly via stochastic gradient descent (SGD), which may converge slowly, while constrained federated optimization, which is more challenging, has not yet been investigated. This paper studies both sample-based and feature-based federated optimization and considers unconstrained and constrained nonconvex problems for each setting. First, we propose FL algorithms using stochastic successive convex approximation (SSCA) and mini-batch techniques. These algorithms can adequately exploit the structures of the objective and constraint functions and utilize samples incrementally. We show that the proposed FL algorithms converge to stationary points of the unconstrained nonconvex problems and to Karush-Kuhn-Tucker (KKT) points of the constrained nonconvex problems, respectively. Next, we provide algorithm examples with appealing computational complexity and communication load per communication round. We show that the proposed algorithm examples for unconstrained federated optimization are identical to FL algorithms via momentum SGD, and we provide an analytical connection between SSCA and momentum SGD. Finally, numerical experiments demonstrate the advantages of the proposed algorithms in convergence speed, communication and computation costs, and model specification.
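The following is a minimal, hypothetical sketch of the kind of mini-batch SSCA update described in the abstract, for the unconstrained case only and without the federated (multi-client) structure. The quadratic surrogate, the toy sigmoid least-squares objective, and the step-size sequences rho and gamma are illustrative assumptions, not the paper's exact algorithm examples. With a quadratic surrogate, the recursion reduces to a momentum-SGD-style update, mirroring the connection between SSCA and momentum SGD mentioned in the abstract.

```python
import numpy as np

# Toy nonconvex problem: minimize (1/2) * mean_i (sigmoid(a_i' x) - y_i)^2.
rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
y = 1.0 / (1.0 + np.exp(-A @ x_true)) + 0.05 * rng.normal(size=n)

def grad_batch(x, idx):
    """Mini-batch stochastic gradient of the nonconvex loss."""
    z = A[idx] @ x
    s = 1.0 / (1.0 + np.exp(-z))
    r = s - y[idx]
    return A[idx].T @ (r * s * (1.0 - s)) / len(idx)

x = np.zeros(d)          # current iterate
g = np.zeros(d)          # recursively averaged gradient of the convex surrogate
tau = 1.0                # curvature of the quadratic surrogate (assumed)
batch_size = 32

for t in range(1, 501):
    rho = 2.0 / (t + 1)          # surrogate averaging weight (diminishing)
    gamma = 1.0 / (t ** 0.75)    # iterate step size (diminishing)
    idx = rng.choice(n, size=batch_size, replace=False)
    # Update the quadratic surrogate by averaging mini-batch gradients recursively.
    g = (1.0 - rho) * g + rho * grad_batch(x, idx)
    # Closed-form surrogate minimizer: argmin_v  g' v + (tau/2) * ||v - x||^2.
    x_bar = x - g / tau
    # Convex-combination update of the iterate (a momentum-SGD-style step).
    x = (1.0 - gamma) * x + gamma * x_bar
```

In the federated setting of the paper, the mini-batch gradients would be computed locally at the clients and aggregated at a server each round; the sketch above only illustrates the single-node SSCA recursion.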
Pages: 3832-3847
Page count: 16