Subject-Level Membership Inference Attack via Data Augmentation and Model Discrepancy

Cited by: 4
Authors
Liu, Yimin [1 ,2 ]
Jiang, Peng [1 ]
Zhu, Liehuang [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Data models; Training; Data privacy; Privacy; Distributed databases; Degradation; Data augmentation; Federated learning; subject-level membership inference attacks; privacy degradation; generative adversarial networks;
DOI
10.1109/TIFS.2023.3318950
CLC number
TP301 [Theory, Methods];
Subject classification number
081202;
Abstract
Federated learning (FL) models are vulnerable to membership inference attacks (MIAs), and the need for individual privacy motivates protecting subjects whose data is distributed across multiple users in the cross-silo FL setting. In this paper, we propose a subject-level membership inference attack based on data augmentation and model discrepancy. It can effectively infer whether the data distribution of a target subject has been sampled and used for training by a specific federated user, even when other users may also sample from the same subject and include it in their training sets. Specifically, the adversary uses a generative adversarial network (GAN) to augment a small amount of prior federation-associated information known in advance. The adversary then merges the two different outputs of the global model and the tested user's model using an optimal feature construction method. We simulate a controlled federation configuration and conduct extensive experiments on real datasets comprising both image and categorical data. Results show that the area under the curve (AUC) improves by 12.6% to 16.8% over the classical membership inference attack, at the cost of the test accuracy on GAN-augmented data, which is at most 3.5% lower than on real test data. We also explore the degree of privacy leakage of overfitted versus well-generalized models in the cross-silo FL setting and find experimentally that the former is more likely to leak individual privacy, with a subject-level degradation rate of up to 0.43. Finally, we present two possible defense mechanisms to mitigate this newly discovered privacy risk.
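The attack pipeline outlined in the abstract — probe samples fed to both the global model and the tested user's model, the two outputs merged into attack features, and a binary attack classifier scored by AUC — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the softmax outputs are simulated stand-ins (in the attack they would come from the actual FL models on GAN-augmented probes), and the paper's "optimal feature construction" is approximated here by simple concatenation plus the element-wise discrepancy of the two output vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
n_member, n_nonmember, n_classes = 200, 200, 10

def simulate_outputs(n, confident):
    """Simulate per-sample softmax vectors; samples whose subject was in
    the training set get more confident (lower-entropy) predictions."""
    logits = rng.normal(size=(n, n_classes))
    logits[:, 0] += 3.0 if confident else 0.5
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Outputs of the global model (g_*) and the tested user's model (l_*)
# on probe samples from member and non-member subjects.
g_mem, l_mem = simulate_outputs(n_member, True), simulate_outputs(n_member, True)
g_non, l_non = simulate_outputs(n_nonmember, False), simulate_outputs(n_nonmember, False)

def features(g, l):
    # Merge the two model outputs: concatenation plus their discrepancy.
    return np.hstack([g, l, g - l])

X = np.vstack([features(g_mem, l_mem), features(g_non, l_non)])
y = np.concatenate([np.ones(n_member), np.zeros(n_nonmember)])

# Tiny logistic-regression attack classifier trained by gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * (p - y).mean()

scores = X @ w + b

def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) formulation."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(f"attack AUC: {auc(scores, y):.3f}")
```

The design choice mirrors the abstract's "model discrepancy" idea: a user who trained on the target subject produces outputs that differ systematically from the global model's, so the difference `g - l` carries membership signal beyond either output alone.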
Pages: 5848-5859
Page count: 12