Analysis of Privacy Leakage in Federated Large Language Models

Cited by: 0
Authors
Vu, Minh N. [1]
Nguyen, Truc [1]
Jeter, Tre' R. [1]
Thai, My T. [1]
Affiliations
[1] Univ Florida, Gainesville, FL 32611 USA
Source
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238 | 2024 / Vol. 238
Funding
National Science Foundation (USA);
Keywords
MEMBERSHIP INFERENCE ATTACKS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large scale of LLMs. While substantial adjustments to the protocol have been introduced in response, a comprehensive privacy analysis of the adapted FL protocol is currently lacking. To address this gap, our work extensively examines the privacy of FL when used for training LLMs, from both theoretical and practical perspectives. In particular, we design two active membership inference attacks with guaranteed theoretical success rates to assess the privacy leakage of various adapted FL configurations. Our theoretical findings are translated into practical attacks, revealing substantial privacy vulnerabilities in popular LLMs, including BERT, RoBERTa, DistilBERT, and OpenAI's GPTs, across multiple real-world language datasets. Additionally, we conduct thorough experiments to evaluate the privacy leakage of these models when data is protected by state-of-the-art differential privacy (DP) mechanisms.
Pages: 23
Related papers
50 in total
  • [21] FedsLLM: Federated Split Learning for Large Language Models over Communication Networks
    Zhao, Kai
    Yang, Zhaohui
    Huang, Chongwen
    Chen, Xiaoming
    Zhang, Zhaoyang
    2024 INTERNATIONAL CONFERENCE ON UBIQUITOUS COMMUNICATION, UCOM 2024, 2024: 438-443
  • [22] LLM-PBE: Assessing Data Privacy in Large Language Models
    Li, Qinbin
    Hong, Junyuan
    Xie, Chulin
    Tan, Jeffrey
    Xin, Rachel
    Hou, Junyi
    Yin, Xavier
    Wang, Zhun
    Hendrycks, Dan
    Wang, Zhangyang
    Li, Bo
    He, Bingsheng
    Song, Dawn
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (11): 3201-3214
  • [23] Feasibility and Prospect of Privacy-preserving Large Language Models in Radiology
    Cai, Wenli
    RADIOLOGY, 2023, 309 (01)
  • [24] Auditing Privacy Defenses in Federated Learning via Generative Gradient Leakage
    Li, Zhuohang
    Zhang, Jiaxin
    Liu, Luyang
    Liu, Jian
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 10122-10132
  • [25] Privacy Leakage from Logits Attack and Its Defense in Federated Distillation
    Xiao, Danyang
    Yang, Diying
    Li, Jialun
    Chen, Xu
    Wu, Weigang
    2024 54TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, DSN 2024, 2024: 169-182
  • [26] Federated learning for privacy-preserving depression detection with multilingual language models in social media posts
    Khalil, Samar Samir
    Tawfik, Noha S.
    Spruit, Marco
    PATTERNS, 2024, 5 (07)
  • [27] Mitigating Privacy Seesaw in Large Language Models: Augmented Privacy Neuron Editing via Activation Patching
    Wu, Xinwei
    Dong, Weilong
    Xu, Shaoyang
    Xiong, Deyi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 5319-5332
  • [28] On Inter-Dataset Code Duplication and Data Leakage in Large Language Models
    Lopez, Jose Antonio Hernandez
    Chen, Boqi
    Saad, Mootez
    Sharma, Tushar
    Varro, Daniel
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2025, 51 (01): 192-205
  • [29] Beyond Class-Level Privacy Leakage: Breaking Record-Level Privacy in Federated Learning
    Yuan, Xiaoyong
    Ma, Xiyao
    Zhang, Lan
    Fang, Yuguang
    Wu, Dapeng
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (04): 2555-2565
  • [30] Privacy-preserving large language models for structured medical information retrieval
    Wiest, Isabella Catharina
    Ferber, Dyke
    Zhu, Jiefu
    van Treeck, Marko
    Meyer, Sonja K.
    Juglan, Radhika
    Carrero, Zunamys I.
    Paech, Daniel
    Kleesiek, Jens
    Ebert, Matthias P.
    Truhn, Daniel
    Kather, Jakob Nikolas
    NPJ DIGITAL MEDICINE, 2024, 7 (01)