Federated Meta Reinforcement Learning for Personalized Tasks

Cited by: 4

Authors
Liu, Wentao [1]
Xu, Xiaolong [2]
Wu, Jintao [2]
Jiang, Jielin [2]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Sch Software, Nanjing 210044, Peoples R China
Source
TSINGHUA SCIENCE AND TECHNOLOGY | 2024, Vol. 29, No. 3
Funding
National Natural Science Foundation of China
Keywords
Training; Meta-learning; Sensitivity; Federated learning; Training data; Reinforcement learning; Data models; Personalization
DOI
10.26599/TST.2023.9010066
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
As an emerging privacy-preserving machine learning framework, Federated Learning (FL) enables different clients to train a shared model collaboratively by exchanging and aggregating model parameters while raw data are kept local and private. When this learning framework is applied to Deep Reinforcement Learning (DRL), the resulting Federated Reinforcement Learning (FRL) can circumvent the heavy data sampling required in conventional DRL and benefit from diversified training data, in addition to the privacy preservation offered by FL. Existing FRL implementations presuppose that clients have compatible tasks that a single global model can cover. In practice, however, clients usually have incompatible (different but still similar) personalized tasks, a situation we call task shift, which may severely hinder the application of FRL in practice. In this paper, we propose a Federated Meta Reinforcement Learning (FMRL) framework that integrates Model-Agnostic Meta-Learning (MAML) and FRL. Specifically, we utilize Proximal Policy Optimization (PPO) to perform multi-step local training with a single round of sampling. Moreover, considering the sensitivity of learning-rate selection in FRL, we reconstruct the aggregation optimizer on the server side with a federated version of Adam (Fed-Adam). Experiments demonstrate that, across different environments, FMRL outperforms other FL methods, with high training efficiency brought by Fed-Adam.
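To make the two mechanisms named in the abstract concrete, the minimal NumPy sketch below illustrates them under stated assumptions; it is not the paper's implementation, and all names here (ppo_local_update, fed_adam_step, and the policy_logp / policy_grad_logp callables) are hypothetical. The client side reuses one batch of sampled trajectories for several epochs of PPO's clipped-surrogate update; the server side applies an Adam-style step to the averaged client delta instead of plain parameter averaging.

import numpy as np

def ppo_local_update(theta, states, actions, advantages, logp_old,
                     policy_logp, policy_grad_logp,
                     epochs=4, lr=3e-4, clip_eps=0.2):
    """Several gradient epochs on ONE sampled batch (hypothetical callables)."""
    for _ in range(epochs):
        # Importance ratio between the current and sampling-time policies.
        ratio = np.exp(policy_logp(theta, states, actions) - logp_old)
        clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        # The min() in the clipped surrogate passes gradient only where the
        # unclipped term is selected; elsewhere the update is zeroed out.
        active = (ratio * advantages) <= (clipped * advantages)
        grads = policy_grad_logp(theta, states, actions)   # shape (batch, dim)
        surrogate_grad = np.mean(
            (active * ratio * advantages)[:, None] * grads, axis=0)
        theta = theta + lr * surrogate_grad                # ascend the surrogate
    return theta

def fed_adam_step(global_theta, client_thetas, m, v, t,
                  lr=1e-2, beta1=0.9, beta2=0.99, eps=1e-8):
    """One server round: average client deltas, apply an Adam-style update."""
    delta = np.mean([ct - global_theta for ct in client_thetas], axis=0)
    pseudo_grad = -delta               # Adam descends, so negate the delta
    m = beta1 * m + (1 - beta1) * pseudo_grad
    v = beta2 * v + (1 - beta2) * pseudo_grad ** 2
    m_hat = m / (1 - beta1 ** t)      # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)      # bias-corrected second moment
    global_theta = global_theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return global_theta, m, v

The clipped importance ratio is what makes multi-epoch reuse of a single sampling round safe, and keeping Adam's moment estimates on the server reduces sensitivity to the aggregation learning rate, which is the motivation the abstract gives for Fed-Adam.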
Pages: 911-926
Number of pages: 16