Reinforcement learning with augmented states in partially expectation and action observable environment

Cited by: 0

Authors:

Guirnaldo, SA [1]

Watanabe, K [1]

Izumi, K [1]

Kiguchi, K [1]

Affiliations:

[1] Saga Univ, Fac Engn Syst & Technol, Grad Sch Sci & Engn, Saga 8408502, Japan

Source:

SICE 2002: PROCEEDINGS OF THE 41ST SICE ANNUAL CONFERENCE, VOLS 1-5 | 2002

Keywords:

partially observable Markov decision processes; expectation; reinforcement learning; perception; perceptual aliasing

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The problem of developing good or optimal policies for partially observable Markov decision processes (POMDPs) remains one of the most alluring areas of research in artificial intelligence. Inspired by how we (humans) form expectations from past experiences and by how those expectations affect our decisions and behaviour, this paper proposes a method called expectation and action augmented states (EAAS) for reinforcement learning, aimed at discovering good or near-optimal policies in partially observable environments. The method uses the concept of expectation to distinguish between aliased states. It works by augmenting the agent's observation with its expectation of that observation. Two problems from the literature were used to test the proposed method. The results show promising characteristics of the method compared with some methods currently used in this domain.
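
The idea of augmenting an observation with an expectation can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the expectation is formed, so here it is simply passed in as a value, and the names `augment`, `TabularQ`, and `ACTIONS`, the inclusion of the last action in the state, and the learning parameters are all assumptions made for illustration. The sketch only shows that two states sharing the same raw observation receive separate value estimates once the expectation differs.

```python
from collections import defaultdict

# Hypothetical action set, chosen only for this illustration.
ACTIONS = ("left", "right")


def augment(observation, expectation, last_action):
    """Return an augmented state: the raw observation plus the agent's
    expectation and its last action (an assumed layout, not the paper's)."""
    return (observation, expectation, last_action)


class TabularQ:
    """Plain tabular Q-learning keyed on augmented states."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.q = defaultdict(float)   # (state, action) -> value, default 0.0

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def greedy(self, state):
        # Action with the highest current value estimate.
        return max(ACTIONS, key=lambda a: self.q[(state, a)])


# Two aliased situations: same observation, different expectations.
agent = TabularQ()
s1 = augment("corridor", "door", "right")
s2 = augment("corridor", "wall", "right")
goal = augment("goal", "goal", "right")
agent.update(s1, "right", 1.0, goal)
agent.update(s2, "left", 1.0, goal)
```

With the raw observation alone, both situations would share a single table entry and the two updates would conflict; keyed on the augmented tuple, `agent.greedy(s1)` and `agent.greedy(s2)` can differ.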

Pages: 823-828

Page count: 6
