Migrating federated learning to centralized learning with the leverage of unlabeled data

Cited: 2
Authors
Wang, Xiaoya [1 ]
Zhu, Tianqing [2 ]
Ren, Wei [1 ]
Zhang, Dongmei [1 ]
Xiong, Ping [3 ]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] Univ Technol Sydney, Sch Comp Sci, Sydney, NSW 2007, Australia
[3] Zhongnan Univ Econ & Law, Sch Informat & Safety Engn, Wuhan 430073, Hubei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Federated learning; Non-IID; Teacher-student; Ensemble learning;
DOI
10.1007/s10115-023-01869-8
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Federated learning carries out cooperative training without sharing local data, and the resulting global model generally performs better than independently trained local models. Because local data never leave the clients, federated learning preserves the privacy of local users. However, the performance of the global model can degrade when clients hold non-IID training data, since the differing distributions of local data cause the weights of local models to diverge. In this paper, we introduce a novel teacher-student framework to alleviate the negative impact of non-IID data: it retains federated learning's advantage in privacy preservation while exploiting centralized learning's advantage in accuracy. We use unlabeled data together with global models acting as teachers to generate a pseudo-labeled dataset, which significantly improves the performance of the global model; using the global model as a teacher also yields more accurate pseudo-labels. In addition, we perform a model rollback to mitigate the impact of latent label noise and data imbalance in the pseudo-labeled dataset. Extensive experiments verify that our teacher ensemble achieves more robust training, and the empirical study shows that relying on centralized pseudo-labeled data makes the global model almost immune to non-IID data.
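The abstract describes two mechanisms: teacher-ensemble pseudo-labeling of unlabeled data, followed by centralized training with a rollback safeguard. The sketch below only illustrates that general idea and is not the authors' implementation; the function names (ensemble_pseudo_label, train_with_rollback), the confidence threshold, and the teacher/student interfaces are all hypothetical placeholders.

```python
import numpy as np

def ensemble_pseudo_label(teachers, unlabeled_x, conf_threshold=0.9):
    # Average the class probabilities predicted by all teacher models
    # (e.g. global models obtained from federated training rounds).
    probs = np.mean([t(unlabeled_x) for t in teachers], axis=0)
    confidence = probs.max(axis=1)        # ensemble confidence per sample
    labels = probs.argmax(axis=1)         # pseudo-label = most likely class
    keep = confidence >= conf_threshold   # discard low-confidence samples
    return unlabeled_x[keep], labels[keep]

def train_with_rollback(student, pseudo_x, pseudo_y, val_x, val_y,
                        fit, evaluate, snapshot, restore):
    # Train the student centrally on the pseudo-labeled data, but roll the
    # model back to its previous state if validation accuracy drops, which
    # guards against noisy pseudo-labels and class imbalance.
    state = snapshot(student)
    acc_before = evaluate(student, val_x, val_y)
    fit(student, pseudo_x, pseudo_y)
    if evaluate(student, val_x, val_y) < acc_before:
        restore(student, state)           # model rollback
    return student
```

Here fit, evaluate, snapshot, and restore stand in for whatever training framework is used; in PyTorch, for example, the snapshot and restore steps would typically go through state_dict() and load_state_dict().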
Pages: 3725-3752
Page count: 28