Federated learning (FL) is a privacy-preserving distributed machine learning paradigm that enables collaborative training among geographically distributed, heterogeneous devices without collecting their data. Extending FL beyond supervised learning, federated reinforcement learning (FRL) was proposed to handle sequential decision-making problems in edge computing systems. However, existing FRL algorithms directly combine model-free RL with FL, which often leads to high sample complexity and a lack of theoretical guarantees. To address these challenges, we propose a novel FRL algorithm that, for the first time, incorporates model-based RL and ensemble knowledge distillation into FL. Specifically, we use FL and knowledge distillation to create an ensemble of dynamics models from the clients, and then train the policy using only the ensemble model, without interacting with the environment. Furthermore, we prove that the proposed algorithm guarantees monotonic improvement. Extensive experimental results demonstrate that our algorithm achieves much higher sample efficiency than classic model-free FRL algorithms on challenging continuous control benchmark environments under edge computing settings. The results also highlight the significant impact of heterogeneous client data and the number of local model update steps on the performance of FRL, validating the insights obtained from our theoretical analysis.
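The idea of combining client dynamics models into an ensemble and distilling that ensemble into a single model can be illustrated with a minimal sketch. It is not the algorithm from the paper: linear least-squares models stand in for the learned dynamics models, the ensemble is a plain average of client predictions, and all function names (`fit_linear_dynamics`, `ensemble_predict`, `distill`, `make_client_data`) are hypothetical. Policy training on the distilled model is omitted.

```python
import numpy as np


def fit_linear_dynamics(states, actions, next_states):
    """Least-squares fit of next_state ~ [state, action] @ W (assumed linear model)."""
    inputs = np.hstack([states, actions])
    weights, *_ = np.linalg.lstsq(inputs, next_states, rcond=None)
    return weights


def ensemble_predict(client_weights, states, actions):
    """Average the next-state predictions of all client models."""
    inputs = np.hstack([states, actions])
    predictions = np.stack([inputs @ w for w in client_weights])
    return predictions.mean(axis=0)


def distill(client_weights, states, actions):
    """Fit one student model to the ensemble's predictions on shared inputs."""
    targets = ensemble_predict(client_weights, states, actions)
    return fit_linear_dynamics(states, actions, targets)


def make_client_data(rng, true_weights, n=200, noise=0.01):
    """Synthetic local transitions for one client (illustrative data only)."""
    states = rng.normal(size=(n, 2))
    actions = rng.normal(size=(n, 1))
    clean = np.hstack([states, actions]) @ true_weights
    next_states = clean + noise * rng.normal(size=clean.shape)
    return states, actions, next_states


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_weights = rng.normal(size=(3, 2))
    # Each client fits a dynamics model on its own data; raw data is never shared.
    clients = [
        fit_linear_dynamics(*make_client_data(rng, true_weights))
        for _ in range(4)
    ]
    # The server distills the ensemble on freshly sampled state-action pairs.
    query_states = rng.normal(size=(50, 2))
    query_actions = rng.normal(size=(50, 1))
    student = distill(clients, query_states, query_actions)
    print(np.abs(student - true_weights).max())
```

In this linear setting the distilled student coincides with the average of the client weights, so distillation adds nothing over parameter averaging. The two differ once the dynamics models are nonlinear, which is the setting the abstract is concerned with.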