Two-stage sampling with predicted distribution changes in federated semi-supervised learning

Cited by: 5
Authors
Zhu, Suxia [1 ,2 ]
Ma, Xing [1 ,2 ]
Sun, Guanglu [1 ,2 ]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Comp Sci & Technol, Harbin 150080, Peoples R China
[2] Harbin Univ Sci & Technol, Res Ctr Informat Secur & Intelligent Technol, Harbin 150080, Peoples R China
Keywords
Semi-supervised learning; Federated learning; Data augmentation
DOI
10.1016/j.knosys.2024.111822
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Federated semi-supervised learning (FSSL) involves training a model in a federated environment using a few labeled samples and many unlabeled samples. Compared with semi-supervised learning, FSSL faces more complex data situations, especially when data are non-independently and identically distributed (non-IID), which adds further challenges to the learning process. A previous method addresses these issues by enlarging the training sample space through random sampling across multiple clients and reweighting the parameters; although it achieves high accuracy, it sacrifices communication efficiency. In this study, we propose PDCFed, a two-stage sampling method that uses the Predicted Distribution Changes of samples under different data augmentations. We evaluate the credibility of each sample based on the maximum probability predicted under weak augmentation. When samples fall in a less reliable space, they are sampled further after their predicted distribution changes are adjusted with a Gaussian function. To enhance the model's generalization ability, an entropy penalty term is added to the unsupervised training loss. Extensive experiments demonstrate that this method outperforms existing methods on three datasets with non-IID data and significantly improves communication efficiency.
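The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden reconstruction, not the paper's implementation: the threshold `tau`, the L2 distance as the "predicted distribution change", the Gaussian width `sigma`, the second-stage cutoff of 0.5, and the entropy weight `lam` are all hypothetical choices made here for illustration.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each row of a probability matrix."""
    return -np.sum(p * np.log(p + eps), axis=1)

def two_stage_sample(p_weak, p_strong, tau=0.95, sigma=0.1):
    """Two-stage selection of unlabeled samples.

    Stage 1: samples whose maximum weak-augmentation probability
    reaches tau are taken as reliable.
    Stage 2: the remaining samples are re-weighted by a Gaussian of
    the change between weak- and strong-augmentation predictions,
    and kept when that weight is large enough.
    """
    reliable = p_weak.max(axis=1) >= tau

    # Predicted distribution change: L2 distance between the two views
    # (one plausible choice; the paper's exact measure may differ).
    change = np.linalg.norm(p_weak - p_strong, axis=1)
    # Gaussian adjustment: small changes map to weights near 1.
    weight = np.exp(-change**2 / (2 * sigma**2))

    # Second-stage selection among the less reliable samples.
    second = (~reliable) & (weight >= 0.5)
    return reliable, second, weight

def unsupervised_loss(p_weak, p_strong, lam=0.1):
    """Cross-entropy to the weak-view pseudo-label plus an entropy penalty."""
    pseudo = p_weak.argmax(axis=1)
    ce = -np.log(p_strong[np.arange(len(pseudo)), pseudo] + 1e-12)
    return ce.mean() + lam * entropy(p_strong).mean()
```

With two samples, one confidently predicted under weak augmentation and one not, the first passes stage 1 directly while the second is admitted (or rejected) in stage 2 according to its Gaussian-adjusted distribution change.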
Pages: 11