FEDBERT: When Federated Learning Meets Pre-training

Cited by: 67
Authors
Tian, Yuanyishu [1]
Wan, Yao [1]
Lyu, Lingjuan [2]
Yao, Dezhong [1]
Jin, Hai [1]
Sun, Lichao [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Serv Comp Technol & Syst Lab, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Cluster & Grid Comp Lab, 1037 Luoyu Rd, Wuhan 430074, Peoples R China
[2] Sony AI, Minato Ku, 1-7-1 Konan, Tokyo, Japan
[3] Lehigh Univ, 113 Res Dr, Bethlehem, PA 18015 USA
Funding
National Natural Science Foundation of China;
Keywords
Federated learning; pre-training; BERT; NLP;
DOI
10.1145/3510033
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid growth of pre-trained models (PTMs) has brought natural language processing (NLP) into a new era, and PTMs have become the dominant technique for a wide range of NLP applications. Any user can download the weights of a PTM and fine-tune them locally for a downstream task. However, pre-training a model relies heavily on access to large-scale training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. To enable clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, FEDBERT, which combines federated learning and split learning to pre-train BERT in a federated way. FEDBERT avoids sharing raw data while still achieving strong performance. Extensive experiments on seven GLUE tasks demonstrate that FEDBERT maintains its effectiveness without transmitting clients' sensitive local data.
Pages: 26