When Is Generalizable Reinforcement Learning Tractable?

Cited by: 0
Authors
Malik, Dhruv [1 ]
Li, Yuanzhi [1 ]
Ravikumar, Pradeep [1 ]
Affiliations
[1] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
Funding
National Science Foundation
Keywords
(none listed)
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
Agents trained by reinforcement learning (RL) often fail to generalize beyond the environment they were trained in, even when presented with new scenarios that seem similar to the training environment. We study the query complexity required to train RL agents that generalize to multiple environments. Intuitively, tractable generalization is only possible when the environments are similar or close in some sense. To capture this, we introduce Weak Proximity, a natural structural condition that requires the environments to have highly similar transition and reward functions and share a policy providing optimal value. Despite such shared structure, we prove that tractable generalization is impossible in the worst case. This holds even when each individual environment can be efficiently solved to obtain an optimal linear policy, and when the agent possesses a generative model. Our lower bound applies to the more complex task of representation learning for efficient generalization to multiple environments. On the positive side, we introduce Strong Proximity, a strengthened condition which we prove is sufficient for efficient generalization.
Pages: 14
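To make the abstract's central condition concrete, the following is a minimal LaTeX sketch of one way Weak Proximity could be formalized. It is reconstructed from the abstract alone: the symbols (the MDPs M_i, transitions P_i, rewards r_i, the tolerance epsilon) and the exact quantifiers are illustrative assumptions, not the paper's precise definitions.

% Sketch: Weak Proximity over n finite-horizon MDPs with shared states S and actions A.
% All symbols and tolerances below are illustrative assumptions based on the abstract.
% Requires amsmath for \text and the display environments.
Consider environments $\mathcal{M}_i = (\mathcal{S}, \mathcal{A}, P_i, r_i, H)$ for $i = 1, \dots, n$.
Weak Proximity requires the transition and reward functions to be uniformly close,
\[
  \max_{s,a} \bigl\| P_i(\cdot \mid s,a) - P_j(\cdot \mid s,a) \bigr\|_1 \le \epsilon
  \qquad \text{and} \qquad
  \max_{s,a} \bigl| r_i(s,a) - r_j(s,a) \bigr| \le \epsilon
  \quad \text{for all } i, j,
\]
and requires that a single policy $\pi$ attain the optimal value in every environment:
\[
  V_{\mathcal{M}_i}(\pi) = \max_{\pi'} V_{\mathcal{M}_i}(\pi') \quad \text{for all } i.
\]

Under this reading, the abstract's lower bound states that even when every $\mathcal{M}_i$ is individually easy to solve and a generative model is available, finding a policy that generalizes across all of them can require a worst-case query complexity that is not polynomial in the relevant problem parameters; Strong Proximity strengthens the shared-structure requirement enough to make efficient generalization provable.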
Related Papers (50 total)
  • [1] When is Agnostic Reinforcement Learning Statistically Tractable?
    Jia, Zeyu
    Li, Gene
    Rakhlin, Alexander
    Sekhari, Ayush
    Srebro, Nathan
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [2] Learning Generalizable and Composable Abstractions for Transfer in Reinforcement Learning
    Nayyar, Rashmeet Kaur
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38, NO 21, 2024: 23403-23404
  • [3] Learning Generalizable Locomotion Skills with Hierarchical Reinforcement Learning
    Li, Tianyu
    Lambert, Nathan
    Calandra, Roberto
    Meier, Franziska
    Rai, Akshara
2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020: 413-419
  • [4] Towards Generalizable Reinforcement Learning for Trade Execution
    Zhang, Chuheng
    Duan, Yitong
    Chen, Xiaoyu
    Chen, Jianyu
    Li, Jian
    Zhao, Li
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023: 4975-4983
  • [5] Generalizable Episodic Memory for Deep Reinforcement Learning
    Hu, Hao
    Ye, Jianing
    Zhu, Guangxiang
    Ren, Zhizhou
    Zhang, Chongjie
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [6] Federated reinforcement learning for generalizable motion planning
    Yuan, Zhenyuan
    Xu, Siyuan
    Zhu, Minghui
2023 AMERICAN CONTROL CONFERENCE, ACC, 2023: 78-83
  • [7] SLATEQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets
    Ie, Eugene
    Jain, Vihan
    Wang, Jing
    Narvekar, Sanmit
    Agarwal, Ritesh
    Wu, Rui
    Cheng, Heng-Tze
    Chandra, Tushar
    Boutilier, Craig
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019: 2592-2599
  • [8] Tractable large-scale deep reinforcement learning
    Sarang, Nima
    Poullis, Charalambos
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 232
  • [9] Tractable Reinforcement Learning of Signal Temporal Logic Objectives
    Venkataraman, Harish
    Aksaray, Derya
    Seiler, Peter
LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020: 308-317
  • [10] Deep Reinforcement Learning for Generalizable Field Development Optimization
    He, Jincong
    Tang, Meng
    Hu, Chaoshun
    Tanaka, Shusei
    Wang, Kainan
    Wen, Xian-Huan
    Nasir, Yusuf
SPE JOURNAL, 2022, 27 (01): 226-245