Model-based Reinforcement Learning and the Eluder Dimension

Cited by: 0

Authors:

Osband, Ian [1]

Van Roy, Benjamin [1]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014) | 2014 / Vol. 27

Funding:

National Science Foundation (USA);

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We consider the problem of learning to optimize an unknown Markov decision process (MDP). We show that, if the MDP can be parameterized within some known function class, we can obtain regret bounds that scale with the dimensionality, rather than cardinality, of the system. We characterize this dependence explicitly as $\tilde{O}(\sqrt{d_K d_E T})$, where $T$ is the time elapsed, $d_K$ is the Kolmogorov dimension, and $d_E$ is the eluder dimension. These represent the first unified regret bounds for model-based reinforcement learning and provide state-of-the-art guarantees in several important settings. Moreover, we present a simple and computationally efficient algorithm, posterior sampling for reinforcement learning (PSRL), that satisfies these bounds.
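The abstract names PSRL but does not spell out its steps. The following is a minimal sketch of the general posterior-sampling idea in a much simpler setting than the one the abstract covers: a small tabular, finite-horizon MDP with known rewards and a Dirichlet prior over transitions. All of these choices, and the names `sample_dirichlet`, `plan`, and `psrl`, are illustrative assumptions and are not taken from the paper.

```python
import random

def sample_dirichlet(alphas, rng):
    # Draw from Dirichlet(alphas) by normalising independent Gamma draws.
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def plan(P, R, H):
    # Finite-horizon backward induction on model P[s][a][s'] with rewards R[s][a].
    # Returns policy[h][s] -> greedy action at step h in state s.
    S, A = len(R), len(R[0])
    V = [0.0] * S
    policy = [[0] * S for _ in range(H)]
    for h in reversed(range(H)):
        new_V = [0.0] * S
        for s in range(S):
            q = [R[s][a] + sum(P[s][a][t] * V[t] for t in range(S))
                 for a in range(A)]
            best = max(range(A), key=lambda a: q[a])
            policy[h][s] = best
            new_V[s] = q[best]
        V = new_V
    return policy

def psrl(true_P, R, H, episodes, seed=0):
    # Assumption: rewards R are known; only transitions are learned.
    rng = random.Random(seed)
    S, A = len(R), len(R[0])
    # Dirichlet(1, ..., 1) prior, stored as pseudo-counts per (state, action).
    counts = [[[1.0] * S for _ in range(A)] for _ in range(S)]
    returns = []
    for _ in range(episodes):
        # 1. Sample one plausible transition model from the posterior.
        sampled_P = [[sample_dirichlet(counts[s][a], rng) for a in range(A)]
                     for s in range(S)]
        # 2. Act optimally for the sampled model during the whole episode.
        policy = plan(sampled_P, R, H)
        s, total = 0, 0.0
        for h in range(H):
            a = policy[h][s]
            total += R[s][a]
            nxt = rng.choices(range(S), weights=true_P[s][a])[0]
            # 3. Update the posterior with the observed transition.
            counts[s][a][nxt] += 1.0
            s = nxt
        returns.append(total)
    return returns, counts
```

Calling `psrl(true_P, R, H, episodes)` with a nested-list transition model and reward table returns the per-episode returns and the final posterior counts. The sketch resamples the model once per episode rather than at every step, which is the usual way posterior sampling is described for episodic problems.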

Pages: 9

50 items in total

[1] Model-based reinforcement learning with dimension reduction
Tangkaratt, Voot
Morimoto, Jun
Sugiyama, Masashi
NEURAL NETWORKS, 2016, 84 : 1 - 16
[2] Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension
Wu, Yue
He, Jiafan
Gu, Quanquan
UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2304 - 2313
[3] Model-based Reinforcement Learning: A Survey
Moerland, Thomas M.
Broekens, Joost
Plaat, Aske
Jonker, Catholijn M.
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2023, 16 (01) : 1 - 118
[4] A survey on model-based reinforcement learning
Fan-Ming LUO
Tian XU
Hang LAI
Xiong-Hui CHEN
Weinan ZHANG
Yang YU
Science China (Information Sciences), 2024, 67 (02) : 59 - 84
[5] Nonparametric model-based reinforcement learning
Atkeson, CG
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10 : 1008 - 1014
[6] The ubiquity of model-based reinforcement learning
Doll, Bradley B.
Simon, Dylan A.
Daw, Nathaniel D.
CURRENT OPINION IN NEUROBIOLOGY, 2012, 22 (06) : 1075 - 1081
[7] Multiple model-based reinforcement learning
Doya, K
Samejima, K
Katagiri, K
Kawato, M
NEURAL COMPUTATION, 2002, 14 (06) : 1347 - 1369
[8] A survey on model-based reinforcement learning
Luo, Fan-Ming
Xu, Tian
Lai, Hang
Chen, Xiong-Hui
Zhang, Weinan
Yu, Yang
SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (02)
[9] Learning to Paint With Model-based Deep Reinforcement Learning
Huang, Zhewei
Heng, Wen
Zhou, Shuchang
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 8708 - 8717
[10] Incremental model-based reinforcement learning with model constraint
Yang, Zhiyou
Fu, Mingsheng
Qu, Hong
Li, Fan
Shi, Shuqing
Hu, Wang
NEURAL NETWORKS, 2025, 185
