Model-based Reinforcement Learning and the Eluder Dimension

Cited by: 0

Authors:

Osband, Ian [1]

Van Roy, Benjamin [1]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014) | 2014 / Volume 27

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We consider the problem of learning to optimize an unknown Markov decision process (MDP). We show that, if the MDP can be parameterized within some known function class, we can obtain regret bounds that scale with the dimensionality, rather than cardinality, of the system. We characterize this dependence explicitly as $\tilde{O}(\sqrt{d_K d_E T})$, where $T$ is time elapsed, $d_K$ is the Kolmogorov dimension and $d_E$ is the eluder dimension. These represent the first unified regret bounds for model-based reinforcement learning and provide state-of-the-art guarantees in several important settings. Moreover, we present a simple and computationally efficient algorithm, posterior sampling for reinforcement learning (PSRL), that satisfies these bounds.
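For readers unfamiliar with posterior sampling, the loop named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper treats general parameterized function classes, whereas this sketch assumes a small tabular, finite-horizon MDP with a Dirichlet prior on transitions and a Gaussian prior on mean rewards. The function names `solve_finite_horizon` and `psrl`, the priors, and the unit-variance reward noise are all assumptions made for illustration.

```python
import numpy as np

def solve_finite_horizon(P, R, H):
    """Backward induction. P: (S, A, S) transition probs, R: (S, A) mean rewards."""
    S, A, _ = P.shape
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V                  # (S, A): reward plus expected next value
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

def psrl(P_true, R_true, H, episodes, seed=0):
    """Tabular posterior-sampling sketch (assumed priors, see lead-in)."""
    rng = np.random.default_rng(seed)
    S, A, _ = P_true.shape
    counts = np.ones((S, A, S))        # assumed Dirichlet(1, ..., 1) prior
    r_sum = np.zeros((S, A))           # sum of observed rewards per (s, a)
    r_n = np.zeros((S, A))             # number of observations per (s, a)
    total = 0.0                        # cumulative expected reward collected
    for _ in range(episodes):
        # 1. Sample one MDP from the current posterior.
        P_hat = np.empty_like(counts)
        for s in range(S):
            for a in range(A):
                P_hat[s, a] = rng.dirichlet(counts[s, a])
        # Assumed N(0, 1) prior on mean reward with unit-variance noise.
        R_hat = rng.normal(r_sum / (r_n + 1.0), 1.0 / np.sqrt(r_n + 1.0))
        # 2. Act optimally for the sampled MDP over one episode.
        policy = solve_finite_horizon(P_hat, R_hat, H)
        s = 0
        for h in range(H):
            a = policy[h, s]
            r = R_true[s, a] + rng.normal()
            s_next = rng.choice(S, p=P_true[s, a])
            # 3. Update the posterior statistics with the observed transition.
            counts[s, a, s_next] += 1
            r_sum[s, a] += r
            r_n[s, a] += 1
            total += R_true[s, a]
            s = s_next
    return total, counts
```

The key design point is that exploration comes only from the randomness of the sampled model: no explicit optimism bonus or confidence set is constructed, and each episode requires solving a single known MDP.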

Pages: 9

50 records in total

[21] A comparison of direct and model-based reinforcement learning
Atkeson, CG
Santamaria, JC
1997 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION - PROCEEDINGS, VOLS 1-4, 1997, : 3557 - 3564
[22] Model-based reinforcement learning in a complex domain
Kalyanakrishnan, Shivaram
Stone, Peter
Liu, Yaxin
ROBOCUP 2007: ROBOT SOCCER WORLD CUP XI, 2008, 5001 : 171 - 183
[23] Lipschitz Continuity in Model-based Reinforcement Learning
Asadi, Kavosh
Misra, Dipendra
Littman, Michael L.
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
[24] A Contraction Approach to Model-based Reinforcement Learning
Fan, Ting-Han
Ramadge, Peter J.
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 325 - +
[25] Model-Based Reinforcement Learning For Robot Control
Li, Xiang
Shang, Weiwei
Cong, Shuang
2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2020), 2020, : 300 - 305
[26] Consistency of Fuzzy Model-Based Reinforcement Learning
Busoniu, Lucian
Ernst, Damien
De Schutter, Bart
Babuska, Robert
2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2008, : 518 - +
[27] Abstraction Selection in Model-Based Reinforcement Learning
Jiang, Nan
Kulesza, Alex
Singh, Satinder
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 179 - 188
[28] Asynchronous Methods for Model-Based Reinforcement Learning
Zhang, Yunzhi
Clavera, Ignasi
Tsai, Boren
Abbeel, Pieter
CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
[29] Online Constrained Model-based Reinforcement Learning
van Niekerk, Benjamin
Damianou, Andreas
Rosman, Benjamin
CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017,
[30] Calibrated Model-Based Deep Reinforcement Learning
Malik, Ali
Kuleshov, Volodymyr
Song, Jiaming
Nemer, Danny
Seymour, Harlan
Ermon, Stefano
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97