Intelligent Model Learning Based on Variance for Bayesian Reinforcement Learning

Cited by: 0

Authors:

You, Shuhua [1]

Liu, Quan [2]

Zhang, Zongzhang [1]

Wang, Hui [1]

Zhang, Xiaofang [1]

Affiliations:

[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China

[2] Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China

Source:

2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015) | 2015

Keywords:

reinforcement learning; Bayesian dynamic programming; model learning; policy learning; Dirichlet distributions;

DOI:

10.1109/ICTAI.2015.37

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We consider a modular approach to reinforcement learning that represents uncertainty about model parameters by maintaining probability distributions over them. The algorithm, which we call MBDP (model-based Bayesian dynamic programming), decomposes into two parallel types of inference: model learning and policy learning. In model learning, we update the posterior distribution over the model from the observation received after taking an action in each state. In policy learning, we solve MDPs by dynamic programming with a greedy approximation, so that the agent chooses the behavior that maximizes return under the estimated model. Furthermore, we propose a principled method that uses the variance of Dirichlet distributions to determine when to learn and relearn the model. We show that, given sufficient model learning, MBDP finds near-optimal policies with high probability, and experimental results show that MBDP outperforms current state-of-the-art reinforcement learning methods.
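The model-learning step and the variance-based trigger described in the abstract can be illustrated with a minimal sketch. It assumes a Dirichlet prior over the next-state distribution of a single state-action pair, and it uses a simple maximum-variance threshold as the relearning test. The function names, the threshold rule, and the example counts are illustrative assumptions, not the criterion used in the paper, which the abstract does not specify.

```python
from typing import List


def dirichlet_update(alpha: List[float], next_state: int) -> List[float]:
    """Return the posterior counts after observing one transition to next_state."""
    updated = list(alpha)
    updated[next_state] += 1.0  # conjugate update: add one to the observed outcome
    return updated


def dirichlet_mean(alpha: List[float]) -> List[float]:
    """Posterior mean transition probabilities, alpha_i / alpha_0."""
    total = sum(alpha)
    return [a / total for a in alpha]


def dirichlet_variance(alpha: List[float]) -> List[float]:
    """Marginal variances: alpha_i (alpha_0 - alpha_i) / (alpha_0^2 (alpha_0 + 1))."""
    total = sum(alpha)
    return [a * (total - a) / (total ** 2 * (total + 1.0)) for a in alpha]


def needs_more_learning(alpha: List[float], threshold: float) -> bool:
    """Hypothetical trigger: keep learning the model while any marginal
    variance exceeds the threshold (an assumed rule, not the paper's)."""
    return max(dirichlet_variance(alpha)) > threshold


if __name__ == "__main__":
    prior = [1.0, 1.0, 1.0]               # uniform prior over three next states
    posterior = dirichlet_update(prior, 0)  # observe a transition to state 0
    print(dirichlet_mean(posterior))
    print(dirichlet_variance(posterior))
    print(needs_more_learning(posterior, threshold=0.01))
```

Because the variance shrinks as the total count grows, a rule of this kind stops updating the model for state-action pairs that have been visited often, while sparsely visited pairs continue to trigger learning.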


Pages: 170-177

Page count: 8
