VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning

Cited by: 61

Authors:

Fu, Fangcheng [1]

Shao, Yingxia

Yu, Lele

Jiang, Jiawei

Xue, Huanran

Tao, Yangyu

Cui, Bin

Affiliations:

[1] Peking Univ, Dept Comp Sci, Beijing, Peoples R China

Source:

SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2021

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China;

Keywords:

Vertical Federated Learning; Gradient Boosting Decision Tree; PRIVACY; APPROXIMATION; REGRESSION; TREE;

DOI:

10.1145/3448016.3457241

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

With the ever-evolving concerns on privacy protection, vertical federated learning (FL), where participants own non-overlapping features for the same set of instances, is becoming a heated topic since it enables multiple enterprises to strengthen the machine learning models collaboratively with privacy guarantees. Nevertheless, to achieve privacy preservation, vertical FL algorithms involve complicated training routines and time-consuming cryptography operations, leading to slow training speed. This paper explores the efficiency of the gradient boosting decision tree (GBDT) algorithm under the vertical FL setting. Specifically, we introduce VF2Boost, a novel and efficient vertical federated GBDT system. Significant solutions are developed to tackle the major bottlenecks. First, to handle the deficiency caused by frequent mutual-waiting in federated training, we propose a concurrent training protocol to reduce the idle periods. Second, to speed up the cryptography operations, we analyze the characteristics of the algorithm and propose customized operations. Empirical results show that our system can be 12.8-18.9x faster than the existing vertical federated implementations and support much larger datasets.
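The vertical setting described in the abstract (parties holding different feature columns for the same instances) can be illustrated with a toy sketch of one GBDT split search. This is not the VF2Boost protocol: the two party names, the squared-error gradients, the bin edges, the regularization constant `reg`, and the helper names `build_histogram` and `best_split` are all assumptions made for illustration, and the gradients are shared in plaintext here, whereas a real vertical FL system would protect them with cryptography.

```python
# Toy sketch: one split search over vertically partitioned data.
# Assumptions (not from the abstract): two parties "A" and "B",
# squared-error loss, fixed bin edges, and NO encryption.

def build_histogram(values, grads, edges):
    """Per-bin [gradient sum, instance count] for one feature column."""
    hist = [[0.0, 0] for _ in range(len(edges) + 1)]
    for value, grad in zip(values, grads):
        bin_index = sum(1 for edge in edges if value > edge)
        hist[bin_index][0] += grad
        hist[bin_index][1] += 1
    return hist

def best_split(hist, reg=1.0):
    """Scan bin boundaries; return (gain, last bin index of the left child).

    The gain is proportional to the usual second-order GBDT split gain
    with hessian 1 per instance (squared-error loss).
    """
    total_g = sum(g for g, _ in hist)
    total_n = sum(n for _, n in hist)
    best_gain, best_bin = 0.0, None
    left_g, left_n = 0.0, 0
    for b in range(len(hist) - 1):
        left_g += hist[b][0]
        left_n += hist[b][1]
        right_g, right_n = total_g - left_g, total_n - left_n
        gain = (left_g ** 2 / (left_n + reg)
                + right_g ** 2 / (right_n + reg)
                - total_g ** 2 / (total_n + reg))
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_gain, best_bin

# Party A holds the labels and one feature; party B holds another feature
# for the same four instances (same row order).
labels = [1.0, 1.0, 0.0, 0.0]
predictions = [0.5, 0.5, 0.5, 0.5]
grads = [p - y for p, y in zip(predictions, labels)]  # squared-error gradients

feature_a = [1, 2, 1, 2]       # held only by party A
feature_b = [10, 12, 30, 35]   # held only by party B

# Each party builds a histogram over its own column only.
candidates = {
    "A": best_split(build_histogram(feature_a, grads, edges=[1.5])),
    "B": best_split(build_histogram(feature_b, grads, edges=[20])),
}
winner = max(candidates, key=lambda party: candidates[party][0])
print(winner, candidates[winner])
```

In this toy data only party B's feature separates the two label groups, so its candidate split wins. The sketch also hints at why mutual waiting arises: party B cannot build its histogram until it has received the gradients from party A.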


Pages: 563-576

Page count: 14
