TITANIC: Towards Production Federated Learning with Large Language Models

Times Cited: 0
Authors
Su, Ningxin [1 ]
Hu, Chenghao [1 ]
Li, Baochun [1 ]
Li, Bo [2 ]
Affiliations
[1] University of Toronto, Department of Electrical and Computer Engineering, Toronto, ON, Canada
[2] Hong Kong University of Science and Technology, Department of Computer Science and Engineering, Hong Kong, People's Republic of China
Source
IEEE INFOCOM 2024 - IEEE Conference on Computer Communications | 2024
DOI
10.1109/INFOCOM52122.2024.10621164
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
With the recent surge of research interest in Large Language Models (LLMs), a natural question is how pre-trained LLMs can be fine-tuned to the specific needs of enterprises and individual users while preserving the privacy of the data used in the fine-tuning process. On the one hand, sending private data to cloud datacenters for fine-tuning is, without a doubt, unacceptable from a privacy perspective. On the other hand, conventional federated learning requires each client to perform local training, which is not feasible for LLMs with respect to both computation cost and communication overhead, since they involve billions of model parameters. In this paper, we present TITANIC, a new distributed training paradigm that allows LLMs to be fine-tuned in a privacy-preserving fashion directly on the client devices where private data is produced, while operating within the clients' constraints on computation and communication bandwidth. TITANIC first selects an optimal subset of clients by efficiently solving an integer optimization problem, then partitions an LLM across multiple client devices, and finally fine-tunes the model with no or minimal loss in training performance. A primary focus in the design of TITANIC is its feasibility in real-world systems: it is designed first and foremost for production-quality systems, featuring a fully automated, model-agnostic partitioning mechanism. Our experimental results show that TITANIC achieves superior training performance compared to conventional federated learning, while preserving data privacy and satisfying all constraints on local computation and bandwidth resources.
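The abstract outlines a two-step pipeline: clients are first selected by solving an integer optimization problem under resource constraints, and the LLM is then partitioned across the selected devices for fine-tuning. The sketch below is a minimal, illustrative rendering of that pipeline only; the objective, constraints, client and capacity names, and the proportional layer-splitting heuristic are assumptions for demonstration, not the paper's actual formulation.

```python
# Illustrative sketch only: the selection objective, constraints, and the
# proportional layer split below are assumptions, not TITANIC's actual method.
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

@dataclass
class Client:
    name: str
    compute: float      # relative compute budget (hypothetical units)
    bandwidth: float    # link bandwidth to the next pipeline stage (Mbit/s)

def select_clients(clients: List[Client], k: int,
                   min_bandwidth: float) -> Tuple[Client, ...]:
    """Toy integer-program stand-in: choose k clients maximizing total
    compute, subject to a per-link bandwidth floor (solved by enumeration)."""
    feasible = [c for c in clients if c.bandwidth >= min_bandwidth]
    return max(combinations(feasible, k),
               key=lambda subset: sum(c.compute for c in subset))

def partition_layers(num_layers: int,
                     selected: Tuple[Client, ...]) -> List[Tuple[str, range]]:
    """Assign contiguous blocks of transformer layers to the selected clients
    in proportion to their compute capacity (pipeline-style partitioning)."""
    total = sum(c.compute for c in selected)
    assignment, start = [], 0
    for i, c in enumerate(selected):
        # the last client takes the remainder so every layer is covered
        share = (num_layers - start if i == len(selected) - 1
                 else round(num_layers * c.compute / total))
        assignment.append((c.name, range(start, start + share)))
        start += share
    return assignment

if __name__ == "__main__":
    pool = [Client("phone-a", 1.0, 40.0), Client("laptop-b", 4.0, 120.0),
            Client("desktop-c", 8.0, 300.0), Client("tablet-d", 2.0, 15.0)]
    chosen = select_clients(pool, k=3, min_bandwidth=30.0)
    for name, layers in partition_layers(num_layers=32, selected=chosen):
        print(f"{name}: layers {layers.start}-{layers.stop - 1}")
```

In this toy run, the bandwidth floor excludes the tablet, and the 32 transformer layers are split roughly in proportion to each remaining client's compute budget; the paper's automated, model-agnostic partitioner is considerably more general than this fixed heuristic.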
Pages: 611-620
Page count: 10