Migrating federated learning to centralized learning with the leverage of unlabeled data

Cited by: 2
Authors
Wang, Xiaoya [1 ]
Zhu, Tianqing [2 ]
Ren, Wei [1 ]
Zhang, Dongmei [1 ]
Xiong, Ping [3 ]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] Univ Technol Sydney, Sch Comp Sci, Sydney, NSW 2007, Australia
[3] Zhongnan Univ Econ & Law, Sch Informat & Safety Engn, Wuhan 430073, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Federated learning; Non-IID; Teacher-student; Ensemble learning;
DOI
10.1007/s10115-023-01869-8
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Federated learning carries out cooperative training without sharing local data; the resulting global model generally outperforms independently trained local models. Because no raw data leaves the clients, federated learning preserves the privacy of local users. However, the performance of the global model can degrade when clients hold non-IID training data, since the differing distributions of local data cause the weights of local models to diverge. In this paper, we introduce a novel teacher-student framework to alleviate the negative impact of non-IID data: it retains the privacy-preserving advantage of federated learning while recovering the accuracy advantage of centralized learning. We use unlabeled data and global models as teachers to generate a pseudo-labeled dataset, which can significantly improve the performance of the global model; the global model, acting as a teacher, also provides more accurate pseudo-labels. In addition, we perform a model rollback to mitigate the impact of latent noisy labels and data imbalance in the pseudo-labeled dataset. Extensive experiments verify that our teacher ensemble yields more robust training, and the empirical study shows that relying on centralized pseudo-labeled data makes the global model almost immune to non-IID data.
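The two mechanisms named in the abstract, pseudo-labeling by a teacher model and model rollback, can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the function names, the confidence threshold, and the generic `train_step`/`evaluate` callables are all assumptions introduced here for clarity.

```python
def pseudo_label(teacher_probs, threshold=0.9):
    """Keep only samples whose top teacher probability clears the threshold.

    teacher_probs: list of per-class probability lists, one per unlabeled sample.
    Returns a list of (sample_index, hard_pseudo_label) pairs.
    """
    kept = []
    for i, probs in enumerate(teacher_probs):
        confidence = max(probs)
        if confidence >= threshold:
            kept.append((i, probs.index(confidence)))  # hard label = argmax class
    return kept


def train_with_rollback(train_step, evaluate, model, rounds=5):
    """Accept each training update only if validation accuracy does not drop.

    train_step(model) -> candidate model; evaluate(model) -> scalar score.
    A candidate that scores worse than the current best is discarded
    (the "rollback"), so noisy pseudo-labels cannot degrade the kept model.
    """
    best_model, best_score = model, evaluate(model)
    for _ in range(rounds):
        candidate = train_step(best_model)
        score = evaluate(candidate)
        if score >= best_score:          # keep the improvement
            best_model, best_score = candidate, score
        # otherwise: roll back, i.e. continue from the previous best model
    return best_model, best_score
```

For example, with three unlabeled samples and a 0.9 threshold, `pseudo_label([[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]])` keeps only the first and last samples, discarding the low-confidence middle one; the rollback loop is agnostic to what `train_step` does internally, which is why the abstract can combine it with any student-training procedure.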
Pages: 3725-3752
Page count: 28
Cited references
40 in total
[31]   From distributed machine learning to federated learning: In the view of data privacy and security [J].
Shen, Sheng ;
Zhu, Tianqing ;
Wu, Di ;
Wang, Wei ;
Zhou, Wanlei .
Concurrency and Computation: Practice & Experience, 2022, 34 (16)
[32]  
Sohn Kihyuk, 2020, NeurIPS
[33]   Data-Aware Device Scheduling for Federated Edge Learning [J].
Taik, Afaf ;
Mlika, Zoubeir ;
Cherkaoui, Soumaya .
IEEE Transactions on Cognitive Communications and Networking, 2022, 8 (01): 408-421
[34]  
Tarvainen A, 2017, Advances in Neural Information Processing Systems, V30
[35]  
Wang H, 2020, IEEE INFOCOM Series, P1698, DOI 10.1109/INFOCOM41043.2020.9155494
[36]   MAB-based Client Selection for Federated Learning with Uncertain Resources in Mobile Networks [J].
Yoshida, Naoya ;
Nishio, Takayuki ;
Morikura, Masahiro ;
Yamamoto, Koji .
2020 IEEE Globecom Workshops (GC Wkshps), 2020
[37]   Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach [J].
Zhang, Zhe ;
Wang, Chunyu ;
Qin, Wenhu ;
Zeng, Wenjun .
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 2197-2206
[38]  
Zhao Y., 2018, arXiv
[39]   More Than Privacy: Applying Differential Privacy in Key Areas of Artificial Intelligence [J].
Zhu, Tianqing ;
Ye, Dayong ;
Wang, Wei ;
Zhou, Wanlei ;
Yu, Philip S. .
IEEE Transactions on Knowledge and Data Engineering, 2022, 34 (06): 2824-2843
[40]  
Zhu ZD, 2021, Proceedings of Machine Learning Research, V139