PAC-Bayesian offline Meta-reinforcement learning

Cited by: 0
Authors
Zheng Sun
Chenheng Jing
Shangqi Guo
Lingling An
Affiliations
[1] Xidian University, Guangzhou Institute of Technology
[2] Tsinghua University, Department of Precision Instrument and Department of Automation
[3] Xidian University, School of Computer Science and Technology
Source
Applied Intelligence | 2023, Vol. 53
Keywords
Meta-reinforcement learning; PAC-Bayesian theory; Dependency graph; Generalization bounds
DOI
Not available
Abstract
Meta-reinforcement learning (Meta-RL) exploits structure shared among tasks to enable rapid adaptation to new tasks from only a small amount of experience. However, most existing Meta-RL algorithms lack theoretical generalization guarantees, or offer such guarantees only under restrictive assumptions (e.g., strong assumptions on the data distribution). This paper conducts, for the first time, a theoretical analysis that estimates the generalization performance of the Meta-RL learner using PAC-Bayesian theory. Applying PAC-Bayesian theory to Meta-RL is challenging because of dependencies in the training data, which render the independent and identically distributed (i.i.d.) assumption invalid. To address this challenge, we propose a dependency graph-based offline decomposition (DGOD) approach, which decomposes non-i.i.d. Meta-RL data into multiple offline i.i.d. datasets using offline sampling and graph decomposition. With the DGOD approach, we derive practical PAC-Bayesian offline Meta-RL generalization bounds and design an algorithm with generalization guarantees, called PAC-Bayesian Offline Meta-Actor-Critic (PBOMAC), to optimize them. Experiments on several challenging Meta-RL benchmarks demonstrate that our algorithm avoids meta-overfitting and outperforms recent state-of-the-art Meta-RL algorithms that lack generalization bounds.
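The abstract's central technical idea is to decompose dependent Meta-RL samples along a dependency graph so that a standard PAC-Bayesian bound applies within each independent subset. The sketch below is not the paper's DGOD or PBOMAC procedure; it is a minimal, hypothetical illustration that assumes a greedy graph coloring as the decomposition step and a classic McAllester-style bound as the per-subset PAC-Bayesian bound. The function names (greedy_coloring, mcallester_bound) and the toy dependency structure are illustrative assumptions, not definitions from the paper.

import math
from collections import defaultdict

def greedy_coloring(num_nodes, edges):
    # Color the dependency graph so that no two samples joined by an
    # edge (i.e., statistically dependent samples) share a color.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for node in range(num_nodes):
        used = {color[nbr] for nbr in adj[node] if nbr in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

def mcallester_bound(empirical_risk, kl_qp, n, delta=0.05):
    # Classic McAllester-style PAC-Bayesian bound for an i.i.d. sample of
    # size n: true risk <= empirical risk
    #   + sqrt((KL(Q || P) + ln(2 * sqrt(n) / delta)) / (2 * n)).
    slack = math.sqrt((kl_qp + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n))
    return empirical_risk + slack

# Toy usage: six transitions with chain-like dependencies (e.g., consecutive
# steps of one trajectory) are split into two mutually independent subsets.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
colors = greedy_coloring(6, edges)
subsets = defaultdict(list)
for node, c in colors.items():
    subsets[c].append(node)
for c, nodes in sorted(subsets.items()):
    # Within a color class the samples are treated as i.i.d., so the
    # standard bound is evaluated with n = len(nodes).
    print(c, nodes, mcallester_bound(0.10, kl_qp=2.0, n=len(nodes)))

Combining the per-subset bounds into a single guarantee (for example, weighting each color class by its size, as in chromatic or fractional-cover PAC-Bayes arguments) is where the paper's actual analysis does the work; the sketch only shows the decomposition step that restores the i.i.d. assumption.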
Pages: 27128-27147
Number of pages: 19
Related papers
50 records in total
  • [11] Meta-Reinforcement Learning in Non-Stationary and Dynamic Environments
    Bing, Zhenshan
    Lerch, David
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3476 - 3491
  • [12] Meta-Reinforcement Learning Algorithm Based on Reward and Dynamic Inference
    Chen, Jinhao
    Zhang, Chunhong
    Hu, Zheng
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT III, PAKDD 2024, 2024, 14647 : 223 - 234
  • [13] Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Enhanced Deep Q-Network
    Ho, Joshua
    Wang, Chien-Min
    King, Chung-Ta
    You, Yi-Hsin
    Feng, Chi-Wei
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2024, 18 (04) : 547 - 569
  • [14] Image quality assessment for machine learning tasks using meta-reinforcement learning
    Saeed, Shaheer U.
    Fu, Yunguan
    Stavrinides, Vasilis
    Baum, Zachary M. C.
    Yang, Qianye
    Rusu, Mirabela
    Fan, Richard E.
    Sonn, Geoffrey A.
    Noble, J. Alison
    Barratt, Dean C.
    Hu, Yipeng
    MEDICAL IMAGE ANALYSIS, 2022, 78
  • [15] Meta-Reinforcement Learning for Centralized Multiple Intersections Traffic Signal Control
    Ren, Yanyu
    Wu, Jia
    Yi, Chenglin
    Ran, Yunchuan
    Lou, Yican
    2022 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2022, : 281 - 286
  • [16] A Meta-Reinforcement Learning-Based Poisoning Attack Framework Against Federated Learning
    Zhou, Wei
    Zhang, Donglai
    Wang, Hongjie
    Li, Jinliang
    Jiang, Mingjian
    IEEE ACCESS, 2025, 13 : 28628 - 28644
  • [17] Learning and Fast Adaptation for Air Combat Decision with Improved Deep Meta-reinforcement Learning
    Zhang, Pin
    Dong, Wenhan
    Cai, Ming
    Li, Dunwang
    Zhang, Xin
    INTERNATIONAL JOURNAL OF AERONAUTICAL AND SPACE SCIENCES, 2024,
  • [18] MetaABR: Environment-Adaptive Video Streaming System with Meta-Reinforcement Learning
    Choi, Wangyu
    Yoon, Jongwon
    PROCEEDINGS OF THE INTERNATIONAL CONEXT STUDENT WORKSHOP 2022, CONEXT-SW 2022, 2022, : 37 - 39
  • [19] Meta-Reinforcement Learning-Based Transferable Scheduling Strategy for Energy Management
    Xiong, Luolin
    Tang, Yang
    Liu, Chensheng
    Mao, Shuai
    Meng, Ke
    Dong, Zhaoyang
    Qian, Feng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (04) : 1685 - 1695
  • [20] Adaptable Image Quality Assessment Using Meta-Reinforcement Learning of Task Amenability
    Saeed, Shaheer U.
    Fu, Yunguan
    Stavrinides, Vasilis
    Baum, Zachary M. C.
    Yang, Qianye
    Rusu, Mirabela
    Fan, Richard E.
    Sonn, Geoffrey A.
    Noble, J. Alison
    Barratt, Dean C.
    Hu, Yipeng
    SIMPLIFYING MEDICAL ULTRASOUND, 2021, 12967 : 191 - 201