Provable Benefit of Multitask Representation Learning in Reinforcement Learning

Citations: 0
Authors
Cheng, Yuan [1 ]
Feng, Songtao [2 ]
Yang, Jing [3 ]
Zhang, Hong [1 ]
Liang, Yingbin [2 ]
Affiliations
[1] University of Science and Technology of China, Hefei, China
[2] The Ohio State University, Columbus, OH, USA
[3] The Pennsylvania State University, University Park, PA, USA
Source
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
While representation learning has become a powerful technique for reducing sample complexity in reinforcement learning (RL) in practice, theoretical understanding of its advantages remains limited. In this paper, we theoretically characterize the benefit of representation learning under the low-rank Markov decision process (MDP) model. We first study multitask low-rank RL (as upstream training), where all tasks share a common representation, and propose a new multitask reward-free algorithm called REFUEL. REFUEL learns both the transition kernel and the near-optimal policy for each task, and outputs a well-learned representation for downstream tasks. Our result demonstrates that multitask representation learning is provably more sample-efficient than learning each task individually, as long as the total number of tasks exceeds a certain threshold. We then study downstream RL in both online and offline settings, where the agent is assigned a new task that shares the same representation as the upstream tasks. For both the online and offline settings, we develop a sample-efficient algorithm and show that it finds a near-optimal policy whose suboptimality gap is bounded by the sum of the estimation error of the representation learned upstream and a term that vanishes as the number of downstream samples grows. Our downstream results for online and offline RL further capture the benefit of employing the representation learned upstream, as opposed to learning the representation of the low-rank model directly. To the best of our knowledge, this is the first theoretical study to characterize the benefit of representation learning in exploration-based reward-free multitask RL for both upstream and downstream tasks.
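For context, a brief sketch of the low-rank MDP model and the shape of the downstream guarantee described in the abstract, written in the standard notation for this model (the specific symbols $\varepsilon_{\mathrm{up}}$ and $N$ are illustrative labels, not taken from the paper):

```latex
% Low-rank MDP: at each step h, the transition kernel factorizes
% through a d-dimensional representation \phi_h and a measure \mu_h:
\[
  P_h(s' \mid s, a) \;=\; \big\langle \phi_h(s,a),\, \mu_h(s') \big\rangle,
  \qquad \phi_h : \mathcal{S}\times\mathcal{A} \to \mathbb{R}^d .
\]
% Multitask setting: tasks t = 1, ..., T share the representation
% \phi_h but may differ in \mu_h^{(t)} and in their rewards.
% Schematically, the downstream suboptimality bound takes the form
\[
  V^{\star} - V^{\hat{\pi}}
  \;\le\;
  \underbrace{\varepsilon_{\mathrm{up}}}_{\substack{\text{representation error}\\ \text{from upstream}}}
  \;+\;
  \underbrace{\tilde{O}\big(\operatorname{poly}(d)/\sqrt{N}\big)}_{\substack{\text{vanishes as downstream}\\ \text{sample size } N \to \infty}} .
\]
```

The first term is inherited from upstream training and does not shrink with more downstream data, which is why the abstract states the gap as a sum of the upstream estimation error and a vanishing term.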
Pages: 14