Federated Split BERT for Heterogeneous Text Classification

Cited by: 3
Authors
Li, Zhengyang [1 ]
Si, Shijing [1 ]
Wang, Jianzong [1 ]
Xiao, Jing [1 ]
Affiliations
[1] Ping An Technol Shenzhen Co Ltd, Shenzhen, Peoples R China
Source
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2022
Keywords
Federated Learning; BERT; Data Heterogeneity; Quantization; Text Classification;
DOI
10.1109/IJCNN55064.2022.9892845
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained BERT models have achieved impressive performance in many natural language processing (NLP) tasks. However, in many real-world situations, textual data are decentralized over many clients and cannot be uploaded to a central server because of privacy protection and regulations. Federated learning (FL) enables multiple clients to collaboratively train a global model while keeping their local data private. A few studies have investigated BERT in the federated learning setting, but the problem of performance loss caused by heterogeneous (e.g., non-IID) data across clients remains under-explored. To address this issue, we propose a framework, FedSplitBERT, which handles heterogeneous data and decreases the communication cost by splitting the BERT encoder layers into a local part and a global part. The local part parameters are trained by the local client only, while the global part parameters are trained by aggregating the gradients of multiple clients. Because of the sheer size of BERT, we explore a quantization method to further reduce the communication cost with minimal performance loss. Our framework is ready to use and compatible with many existing federated learning algorithms, including FedAvg, FedProx and FedAdam. Our experiments verify the effectiveness of the proposed framework, which outperforms baseline methods by a significant margin, while FedSplitBERT with quantization can reduce the communication cost by 11.9x.
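The local/global split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`split_params`, `aggregate_global`, `federated_round`), the toy layer names, the choice of which layer is local, and the use of plain lists in place of tensors are all assumptions made for illustration. The aggregation shown is a FedAvg-style weighted average of parameters, used here as a stand-in for the gradient aggregation the abstract mentions; quantization is not covered.

```python
from typing import Dict, List, Tuple

# Toy stand-in for model weights: layer name -> flat list of floats (assumption).
Params = Dict[str, List[float]]


def split_params(params: Params, local_layers: List[str]) -> Tuple[Params, Params]:
    """Partition a client's parameters into a private local part and a shared global part."""
    local = {k: v for k, v in params.items() if k in local_layers}
    shared = {k: v for k, v in params.items() if k not in local_layers}
    return local, shared


def aggregate_global(parts: List[Params], num_samples: List[int]) -> Params:
    """FedAvg-style average of the global parts, weighted by each client's sample count."""
    total = sum(num_samples)
    averaged: Params = {}
    for name in parts[0]:
        size = len(parts[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(parts, num_samples)) / total
            for i in range(size)
        ]
    return averaged


def federated_round(
    clients: List[Params], num_samples: List[int], local_layers: List[str]
) -> List[Params]:
    """One round: average only the global part; each client's local part is left untouched."""
    splits = [split_params(c, local_layers) for c in clients]
    new_global = aggregate_global([shared for _, shared in splits], num_samples)
    return [{**local, **new_global} for local, _ in splits]


# Two hypothetical clients; "layer_0" is treated as the local part (assumption).
clients = [
    {"layer_0": [1.0, 2.0], "layer_1": [0.0, 4.0]},
    {"layer_0": [3.0, 6.0], "layer_1": [2.0, 8.0]},
]
updated = federated_round(clients, num_samples=[1, 3], local_layers=["layer_0"])
```

In this sketch only the global part would need to be exchanged with the server, which is where the communication saving from splitting comes from; the local part never leaves the client and can adapt to that client's data distribution.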
Pages: 8