Optimistic Planning for Belief-Augmented Markov Decision Processes

Cited by: 0
Authors
Fonteneau, Raphael [1 ,2 ]
Busoniu, Lucian [3 ,4 ]
Munos, Remi [2 ]
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium
[2] Inria Lille Nord Europe, Team SequeL, Lille, France
[3] Univ Lorraine, CRAN, UMR 7039, Nancy, France
[4] CNRS, CRAN, UMR 7039, Nancy, France
Keywords: none listed
DOI: not available
CLC number: TP301 [Theory and Methods]
Discipline code: 081202
Abstract
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to settings where the transition model of the MDP is initially unknown and progressively learned through interactions with the environment. Knowledge about the unknown MDP is represented as a probability distribution over all possible transition models, using Dirichlet distributions, and BOP plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP converges to Bayesian optimality as the budget parameter increases to infinity. Preliminary empirical validations show promising performance.
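The belief representation described in the abstract — a Dirichlet posterior over transition probabilities, updated from observed transitions and concatenated with the state — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the class name, uniform prior value, and tuple-based augmented state are assumptions for illustration.

```python
from collections import defaultdict

class DirichletBelief:
    """Hypothetical sketch of a posterior over an unknown MDP transition model.

    Maintains one vector of Dirichlet counts per (state, action) pair;
    observing a transition increments the count of the successor state.
    """

    def __init__(self, n_states, prior=1.0):
        # Assumed uniform Dirichlet prior: every successor starts at `prior`.
        self.n_states = n_states
        self.counts = defaultdict(lambda: [prior] * n_states)

    def update(self, s, a, s_next):
        # Bayesian update: one observed transition adds one count.
        self.counts[(s, a)][s_next] += 1.0

    def mean(self, s, a):
        # Posterior mean transition probabilities for (s, a).
        c = self.counts[(s, a)]
        total = sum(c)
        return [ci / total for ci in c]

# Belief-augmented state: the original state paired with the current posterior.
belief = DirichletBelief(n_states=3)
belief.update(0, 0, 2)                 # observe transition (s=0, a=0) -> s'=2
augmented_state = (0, belief)
print(belief.mean(0, 0))               # prints [0.25, 0.25, 0.5]
```

Planning then treats `augmented_state` as the state of a larger (belief-augmented) MDP, so that the value of an action accounts for both the reward and the information gained about the transition model.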
Pages: 77-84 (8 pages)
Related papers (50 total)
  • [1] Optimistic planning in Markov decision processes using a generative model
    Szorenyi, Balazs
    Kedenburg, Gunnar
    Munos, Remi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [2] Belief-augmented OWL (BOWL) - Engineering the Semantic Web with beliefs
    Feng, Yuzhang
    Li, Yuan Fang
    Tan, Colin Keng-Yan
    Wadhwa, Bimlesh
    Wang, Hai
    12TH IEEE INTERNATIONAL CONFERENCE ON ENGINEERING COMPLEX COMPUTER SYSTEMS, PROCEEDINGS, 2007, : 165 - +
  • [3] Planning with Abstract Markov Decision Processes
    Gopalan, Nakul
    desJardins, Marie
    Littman, Michael L.
    MacGlashan, James
    Squire, Shawn
    Tellex, Stefanie
    Winder, John
    Wong, Lawson L. S.
    TWENTY-SEVENTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING, 2017, : 480 - 488
  • [4] Preference Planning for Markov Decision Processes
    Li, Meilun
    She, Zhikun
    Turrini, Andrea
    Zhang, Lijun
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3313 - 3319
  • [5] Multiagent, Multitarget Path Planning in Markov Decision Processes
    Nawaz, Farhad
    Ornik, Melkior
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (12) : 7560 - 7574
  • [6] Approximate planning and verification for large Markov decision processes
    Lassaigne, Richard
    Peyronnet, Sylvain
    INTERNATIONAL JOURNAL ON SOFTWARE TOOLS FOR TECHNOLOGY TRANSFER, 2015, 17 (04) : 457 - 467
  • [7] Oblivious Markov Decision Processes: Planning and Policy Execution
    Alsayegh, Murtadha
    Fuentes, Jose
    Bobadilla, Leonardo
    Shell, Dylan A.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 3850 - 3857
  • [9] Planning using hierarchical constrained Markov decision processes
    Feyzabadi, Seyedshams
    Carpin, Stefano
    AUTONOMOUS ROBOTS, 2017, 41 (08) : 1589 - 1607
  • [10] Probabilistic Preference Planning Problem for Markov Decision Processes
    Li, Meilun
    Turrini, Andrea
    Hahn, Ernst Moritz
    She, Zhikun
    Zhang, Lijun
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (05) : 1545 - 1559