Communication-Efficient Vertical Federated Learning via Compressed Error Feedback

Times Cited: 0
Authors
Valdeira, Pedro [1 ,2 ,3 ]
Xavier, Joao [2 ]
Soares, Claudia [4 ]
Chi, Yuejie [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Univ Lisbon, Inst Super Tecn, P-1049001 Lisbon, Portugal
[3] Inst Syst & Robot, Lab Robot & Engn Syst, P-1600011 Lisbon, Portugal
[4] Univ Nova Lisboa, NOVA Sch Sci & Technol, Dept Comp Sci, P-2829516 Caparica, Portugal
Funding
U.S. National Science Foundation;
Keywords
Servers; Compressors; Training; Convergence; Vectors; Federated learning; Receivers; Optimization methods; Electronic mail; Data models; Vertical federated learning; nonconvex optimization; communication-compressed optimization; QUANTIZATION;
DOI
10.1109/TSP.2025.3540655
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808; 0809
Abstract
Communication overhead is a known bottleneck in federated learning (FL). To address this, lossy compression is commonly used on the information communicated between the server and clients during training. In horizontal FL, where each client holds a subset of the samples, such communication-compressed training methods have recently seen significant progress. However, in their vertical FL counterparts, where each client holds a subset of the features, our understanding remains limited. To address this, we propose an error feedback compressed vertical federated learning (EF-VFL) method to train split neural networks. In contrast to previous communication-compressed methods for vertical FL, EF-VFL does not require a vanishing compression error for the gradient norm to converge to zero for smooth nonconvex problems. By leveraging error feedback, our method can achieve an O(1/T) convergence rate for a sufficiently large batch size, improving over the state-of-the-art O(1/√T) rate under O(1/√T) compression error, and matching the rate of uncompressed methods. Further, when the objective function satisfies the Polyak-Łojasiewicz inequality, our method converges linearly. In addition to improving convergence, our method also supports the use of private labels. Numerical experiments show that EF-VFL significantly improves over the prior art, confirming our theoretical results.
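The core mechanism described in the abstract is error feedback wrapped around a lossy compressor. Below is a minimal, hypothetical Python sketch of that generic mechanism, not the paper's exact EF-VFL algorithm: a top-k sparsifier (an assumed example of a contractive compressor) combined with an error memory, so that information discarded by compression in one round is reinjected in later rounds instead of being lost.

    import numpy as np

    def top_k(x, k):
        # Top-k sparsification: keep the k largest-magnitude entries, zero the rest.
        out = np.zeros_like(x)
        idx = np.argpartition(np.abs(x), -k)[-k:]
        out[idx] = x[idx]
        return out

    class ErrorFeedbackCompressor:
        # Generic error-feedback wrapper (illustrative, not the authors' code):
        # the sender transmits compress(message + memory) and locally keeps the
        # residual, so the compression error is corrected over subsequent rounds.
        def __init__(self, k):
            self.k = k
            self.memory = None  # accumulated compression error

        def compress(self, message):
            if self.memory is None:
                self.memory = np.zeros_like(message)
            corrected = message + self.memory
            sent = top_k(corrected, self.k)   # what is actually communicated
            self.memory = corrected - sent    # residual kept for the next round
            return sent

In a split-network VFL setting, such a wrapper would sit on the quantities exchanged between each client and the server (e.g., feature representations or partial derivatives), and any contractive compressor could replace the top-k choice used here for illustration.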
Pages: 1065-1080
Number of Pages: 16