Privacy-Preserving Collaborative Deep Learning With Unreliable Participants

Cited by: 162
Authors
Zhao, Lingchen [1,2]
Wang, Qian [1,2]
Zou, Qin [3]
Zhang, Yan [3,4]
Chen, Yanjiao [3]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan 430072, Hubei, Peoples R China
[2] State Key Lab Cryptog, Beijing 100878, Peoples R China
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
[4] Huawei Technol Co Ltd, Shenzhen 518129, Guangdong, Peoples R China
Keywords
Collaborative learning; deep learning; privacy; neural networks; sensitivity; regression
DOI
10.1109/TIFS.2019.2939713
CLC classification
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
With powerful parallel-computing GPUs and massive user data, neural-network-based deep learning excels at modeling and solving complex problems, and has achieved great success in applications such as image classification, speech recognition, and machine translation. As deep learning grows increasingly popular, the problem of privacy leakage becomes ever more pressing. Given that the training data may contain highly sensitive information, e.g., personal medical records, directly sharing them among the users (i.e., participants) or storing them centrally in a single location may pose a considerable threat to user privacy. In this paper, we present a practical privacy-preserving collaborative deep learning system that allows users to cooperatively build a collective deep learning model from the data of all participants, without direct data sharing or central data storage. In our system, each participant trains a local model with its own data and shares only model parameters with the others. To further avoid potential privacy leakage from the shared model parameters, we use the functional mechanism to perturb the objective function of the neural network during training, achieving epsilon-differential privacy. In particular, for the first time, we consider the existence of unreliable participants, i.e., participants with low-quality data, and propose a solution that reduces their impact while still protecting their privacy. We evaluate the performance of our system on two well-known real-world datasets for regression and classification tasks. The results demonstrate that the proposed system is robust against unreliable participants and achieves accuracy close to that of a model trained in the traditional centralized manner, while ensuring rigorous privacy protection.
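As a concrete illustration of the mechanism the abstract refers to, the following sketch (Python with NumPy; not code from the paper) applies the functional mechanism to ordinary linear regression, where the squared loss is exactly a degree-2 polynomial in the weights: the polynomial coefficients are perturbed once with Laplace noise calibrated to epsilon, and the noisy objective is then minimized. The function names and the `sensitivity` argument are assumptions for illustration; as in the functional-mechanism analysis, the sensitivity bound must be derived from the data domain, and for neural networks the loss is first approximated by a low-degree polynomial.

    import numpy as np

    def perturbed_objective_coeffs(X, y, epsilon, sensitivity):
        # The squared loss sum_i (y_i - x_i^T w)^2 expands to
        # const + lam1^T w + w^T lam2 w, so perturbing (lam1, lam2)
        # once with Laplace noise privatizes the whole objective.
        rng = np.random.default_rng()
        lam1 = -2.0 * X.T @ y              # degree-1 coefficients
        lam2 = X.T @ X                     # degree-2 coefficients
        scale = sensitivity / epsilon      # Laplace scale for epsilon-DP
        lam1 = lam1 + rng.laplace(0.0, scale, size=lam1.shape)
        lam2 = lam2 + rng.laplace(0.0, scale, size=lam2.shape)
        lam2 = (lam2 + lam2.T) / 2.0       # keep the quadratic form symmetric
        return lam1, lam2

    def fit_private_linear_model(X, y, epsilon, sensitivity):
        lam1, lam2 = perturbed_objective_coeffs(X, y, epsilon, sensitivity)
        # The minimizer of w^T lam2 w + lam1^T w solves 2*lam2 w = -lam1;
        # lstsq guards against a noise-induced ill-conditioned system.
        w, *_ = np.linalg.lstsq(2.0 * lam2, -lam1, rcond=None)
        return w

In the collaborative setting described above, each participant would privatize and train its local model in this way and share only the resulting parameters w, never the raw data.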
Pages: 1486-1500
Page count: 15