Analysis of Privacy Leakage in Federated Large Language Models

Cited by: 0
Authors:
Vu, Minh N. [1]
Nguyen, Truc [1]
Jeter, Tre' R. [1]
Thai, My T. [1]
Affiliation:
[1] Univ Florida, Gainesville, FL 32611 USA
Source:
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238 | 2024, Vol. 238
Funding:
US National Science Foundation
Keywords:
MEMBERSHIP INFERENCE ATTACKS
DOI:
Not available
CLC Classification:
TP18 [Artificial Intelligence Theory]
Discipline Codes:
081104; 0812; 0835; 1405
Abstract:
With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large scale of LLMs. While substantial adjustments to the protocol have been introduced in response, a comprehensive privacy analysis of the adapted FL protocol is currently lacking. To address this gap, our work presents an extensive privacy analysis of FL when used for training LLMs, from both theoretical and practical perspectives. In particular, we design two active membership inference attacks with guaranteed theoretical success rates to assess the privacy leakage of various adapted FL configurations. Our theoretical findings are translated into practical attacks, revealing substantial privacy vulnerabilities in popular LLMs, including BERT, RoBERTa, DistilBERT, and OpenAI's GPTs, across multiple real-world language datasets. Additionally, we conduct thorough experiments to evaluate the privacy leakage of these models when data is protected by state-of-the-art differential privacy (DP) mechanisms.
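To make the threat model in the abstract concrete, the following is a minimal, hypothetical Python sketch of a passive loss-threshold membership inference attack against a toy model trained on one party's data, together with a crude Gaussian-noise step that only mimics the flavor of a DP-SGD-style defense (no gradient clipping, no privacy accounting). This is not the paper's active attack; every name and parameter below is an illustrative assumption. It only demonstrates the underlying signal such attacks exploit: records used in training tend to incur lower loss than held-out records.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Numerically clipped logistic function.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def train_logreg(X, y, lr=0.5, epochs=200, dp_sigma=0.0):
    # Plain gradient-descent logistic regression; when dp_sigma > 0, Gaussian
    # noise is added to each gradient step. Illustrative assumption only: a
    # real DP mechanism would also clip per-example gradients and track a
    # privacy budget.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        if dp_sigma > 0.0:
            grad += rng.normal(0.0, dp_sigma, size=grad.shape)
        w -= lr * grad
    return w

def per_example_loss(w, X, y):
    # Cross-entropy loss of each record: the attacker's membership signal.
    p = np.clip(sigmoid(X @ w), 1e-9, 1.0 - 1e-9)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Synthetic data: "members" were used for training, "non-members" were not.
d = 20
X_mem = rng.normal(size=(200, d)); y_mem = (X_mem[:, 0] > 0).astype(float)
X_out = rng.normal(size=(200, d)); y_out = (X_out[:, 0] > 0).astype(float)

for sigma in (0.0, 0.5):
    w = train_logreg(X_mem, y_mem, dp_sigma=sigma)
    # Threshold at the median non-member loss, so the false-positive rate is
    # ~0.5 by construction; any true-positive rate above 0.5 indicates leakage.
    thresh = np.median(per_example_loss(w, X_out, y_out))
    tpr = np.mean(per_example_loss(w, X_mem, y_mem) < thresh)
    fpr = np.mean(per_example_loss(w, X_out, y_out) < thresh)
    print(f"dp_sigma={sigma}: attack TPR={tpr:.2f}, FPR={fpr:.2f}")

Under these assumptions, one would expect the TPR's gap over the 0.5 baseline to shrink as dp_sigma grows, mirroring at toy scale the privacy/utility trade-off the paper evaluates for DP-protected FL training of LLMs.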
Pages: 23