Privacy Preserving Vertical Federated Learning for Tree-based Models

Cited by: 91
Authors
Wu, Yuncheng [1]
Cai, Shaofeng [1]
Xiao, Xiaokui [1]
Chen, Gang [2]
Ooi, Beng Chin [1]
Affiliations
[1] National University of Singapore, Singapore, Singapore
[2] Zhejiang University, Hangzhou, Zhejiang, People's Republic of China
Source
Proceedings of the VLDB Endowment | 2020 / Vol. 13 / No. 11
Keywords
Learning systems - Privacy-preserving techniques;
DOI
10.14778/3407790.3407811
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Federated learning (FL) is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data to each other. This paper studies vertical federated learning, which addresses scenarios where (i) collaborating organizations own data of the same set of users but with disjoint features, and (ii) only one organization holds the labels. We propose Pivot, a novel solution for privacy-preserving vertical decision tree training and prediction, ensuring that no intermediate information is disclosed other than what the clients have agreed to release (i.e., the final tree model and the prediction output). Pivot does not rely on any trusted third party and provides protection against a semi-honest adversary that may compromise m - 1 out of m clients. We further identify two privacy leakages that arise when the trained decision tree model is released in plaintext, and propose an enhanced protocol to mitigate them. The proposed solution can also be extended to tree ensemble models, e.g., random forest (RF) and gradient boosting decision tree (GBDT), by treating single decision trees as building blocks. Theoretical and experimental analyses suggest that Pivot is efficient for the privacy achieved.
Pages: 2090-2103
Number of pages: 14
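
To make the setting described in the abstract concrete, the following is a minimal plaintext sketch of the vertical data layout: the same users, disjoint features per client, and labels held by one client only. It is not the Pivot protocol and performs no encryption or secure computation; the client names, feature names, values, and helper functions are all hypothetical and chosen only for illustration. Gini impurity is used here as a familiar split criterion, which is an assumption and not a detail taken from the abstract.

```python
# Illustrative only: plaintext layout of vertically partitioned data.
# All names and values are hypothetical; nothing here is encrypted,
# so this sketch offers none of the privacy guarantees of the paper.

# Three clients hold disjoint features for the same four users (keyed by user id).
client_features = {
    "client_a": {"age": {1: 25, 2: 40, 3: 31, 4: 58}},
    "client_b": {"income": {1: 30, 2: 85, 3: 52, 4: 47}},
    "client_c": {"visits": {1: 2, 2: 7, 3: 1, 4: 4}},
}

# Only one client (here, client_a) also holds the binary labels.
labels = {1: 0, 2: 1, 3: 0, 4: 1}


def split_indicator(column, threshold):
    """Feature owner marks which users go left (value <= threshold)."""
    return {uid: value <= threshold for uid, value in column.items()}


def gini(label_values):
    """Gini impurity of a list of binary labels (0.0 for an empty list)."""
    if not label_values:
        return 0.0
    p = sum(label_values) / len(label_values)
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def weighted_gini(indicator, labels):
    """Score a candidate split from a left/right indicator and the labels."""
    left = [labels[u] for u, go_left in indicator.items() if go_left]
    right = [labels[u] for u, go_left in indicator.items() if not go_left]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# client_b proposes the split "income <= 50"; the labels score it.
indicator = split_indicator(client_features["client_b"]["income"], 50)
score = weighted_gini(indicator, labels)
print(score)
```

In this plaintext version the indicator vector and the labels are visible to whoever evaluates the split, which is exactly the kind of intermediate information a privacy-preserving protocol would need to hide.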