Optimistic Planning for Belief-Augmented Markov Decision Processes

Cited: 0
Authors
Fonteneau, Raphael [1 ,2 ]
Busoniu, Lucian [3 ,4 ]
Munos, Remi [2 ]
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium
[2] Inria Lille Nord Europe, Team SequeL, Lille, France
[3] Univ Lorraine, CRAN, UMR 7039, Nancy, France
[4] CNRS, CRAN, UMR 7039, Nancy, France
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP301 [Theory and Methods]
Discipline Classification Code
081202
Abstract
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to settings where the transition model of the MDP is initially unknown and progressively learned through interaction with the environment. Knowledge about the unknown MDP is represented as a probability distribution over all possible transition models using Dirichlet distributions, and the BOP algorithm plans in the belief-augmented state space obtained by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal as the budget parameter increases to infinity. Preliminary empirical validation shows promising performance.
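The belief representation described in the abstract, a Dirichlet posterior over next-state probabilities for each state-action pair, paired with the current state to form the belief-augmented state used for planning, can be illustrated with a minimal sketch. This is not the authors' implementation; the class and parameter names are illustrative, and finite state and action spaces are assumed.

    import numpy as np

    class DirichletTransitionBelief:
        """Independent Dirichlet posterior over P(s' | s, a) for each (s, a) pair."""

        def __init__(self, n_states, n_actions, prior_count=1.0):
            # alpha[s, a, s'] are Dirichlet concentration parameters;
            # prior_count = 1.0 corresponds to a uniform prior over next states.
            self.alpha = np.full((n_states, n_actions, n_states), prior_count)

        def update(self, s, a, s_next):
            # Conjugate Bayesian update after observing the transition (s, a, s').
            self.alpha[s, a, s_next] += 1.0

        def mean_transition(self, s, a):
            # Posterior-mean estimate of the transition distribution P(. | s, a).
            counts = self.alpha[s, a]
            return counts / counts.sum()

    # A belief-augmented state pairs the current MDP state with the current
    # posterior over transition models (here, the Dirichlet counts).
    belief = DirichletTransitionBelief(n_states=3, n_actions=2)
    belief.update(s=0, a=1, s_next=2)
    augmented_state = (0, belief.alpha.copy())
    print(belief.mean_transition(0, 1))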
Pages: 77 - 84
Number of pages: 8