Intelligent Model Learning Based on Variance for Bayesian Reinforcement Learning

Authors
You, Shuhua [1]
Liu, Quan [2]
Zhang, Zongzhang [1 ]
Wang, Hui [1 ]
Zhang, Xiaofang [1 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[2] Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China
Source
2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015) | 2015
Keywords
reinforcement learning; Bayesian dynamic programming; model learning; policy learning; Dirichlet distributions;
DOI
10.1109/ICTAI.2015.37
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider a modular approach to reinforcement learning that represents uncertainty about model parameters by maintaining probability distributions over them. The algorithm, which we call MBDP (model-based Bayesian dynamic programming), decomposes into two parallel types of inference: model learning and policy learning. In model learning, we update the posterior distribution over the model from the observations obtained after taking an action in each state. In policy learning, we solve MDPs by dynamic programming with a greedy approximation, so that the agent chooses the behaviors that maximize return under the estimated model. Furthermore, we propose a principled method that uses the variance of Dirichlet distributions to determine when to learn and relearn the model. We demonstrate that, given sufficient model learning, MBDP finds near-optimal policies with high probability, and experimental results show that MBDP outperforms current state-of-the-art reinforcement learning methods.
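The two kinds of inference named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MBDP implementation: the function names, the max-variance threshold rule in `model_is_settled`, and the use of plain value iteration as the dynamic programming step are all assumptions made for illustration. Only the Dirichlet mean and variance formulas are standard results.

```python
import numpy as np


def dirichlet_update(alpha, next_state):
    """Posterior Dirichlet counts after observing one transition to next_state."""
    posterior = np.asarray(alpha, dtype=float).copy()
    posterior[next_state] += 1.0
    return posterior


def dirichlet_mean(alpha):
    """Expected transition probabilities under Dirichlet(alpha)."""
    alpha = np.asarray(alpha, dtype=float)
    return alpha / alpha.sum()


def dirichlet_variance(alpha):
    """Per-component variance: a_i * (a0 - a_i) / (a0^2 * (a0 + 1))."""
    alpha = np.asarray(alpha, dtype=float)
    a0 = alpha.sum()
    return alpha * (a0 - alpha) / (a0 ** 2 * (a0 + 1.0))


def model_is_settled(alpha, threshold):
    """Hypothetical rule (an assumption, not taken from the paper):
    treat the model for one state-action pair as learned once the
    largest component variance falls below the threshold."""
    return bool(dirichlet_variance(alpha).max() < threshold)


def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Greedy dynamic programming on an estimated model.

    P has shape (S, A, S) and holds transition probabilities,
    R has shape (S, A) and holds expected rewards.
    Returns the value function and a greedy policy.
    """
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)      # (S, A, S) @ (S,) -> (S, A)
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
```

In this sketch an agent would call `dirichlet_update` after each observed transition, plug `dirichlet_mean` into `P`, and rerun `value_iteration` whenever `model_is_settled` changes its answer. How MBDP actually schedules learning and relearning is described in the paper itself, not here.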
Pages: 170-177
Number of pages: 8
Related papers
50 in total
  • [1] A Motor Learning Neural Model based on Bayesian Network and Reinforcement Learning
    Hosoya, Haruo
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2009, : 760 - 767
  • [2] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [3] Smarter Sampling in Model-Based Bayesian Reinforcement Learning
    Castro, Pablo Samuel
    Precup, Doina
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I: EUROPEAN CONFERENCE, ECML PKDD 2010, 2010, 6321 : 200 - 214
  • [4] Model-based Bayesian Reinforcement Learning for Dialogue Management
    Lison, Pierre
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 475 - 479
  • [5] Model-based Lifelong Reinforcement Learning with Bayesian Exploration
    Fu, Haotian
    Yu, Shangqun
    Littman, Michael
    Konidaris, George
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [6] Reward Shaping for Model-Based Bayesian Reinforcement Learning
    Kim, Hyeoneun
    Lim, Woosang
    Lee, Kanghoon
    Noh, Yung-Kyun
    Kim, Kee-Eung
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3548 - 3555
  • [7] Variational Inference MPC for Bayesian Model-based Reinforcement Learning
    Okada, Masashi
    Taniguchi, Tadahiro
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [8] Bayesian Model-Based Offline Reinforcement Learning for Product Allocation
    Jenkins, Porter
    Wei, Hua
    Jenkins, J. Stockton
    Li, Zhenhui
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 12531 - 12537
  • [9] Robust and Explorative Behavior in Model-based Bayesian Reinforcement Learning
    Hishinuma, Toru
    Senda, Kei
    PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [10] Shaping Bayesian Network Based Reinforcement Learning
    Song, Jiong
    Jin, Zhao
    2012 INTERNATIONAL CONFERENCE ON INDUSTRIAL CONTROL AND ELECTRONICS ENGINEERING (ICICEE), 2012, : 742 - 745