Reinforcement learning with expectation and action augmented states in partially observable environment

Cited by: 0
Authors
Guirnaldo, SA [1 ]
Watanabe, K [1 ]
Izumi, K [1 ]
Kiguchi, K [1 ]
Affiliation
[1] Saga Univ, Fac Engn Syst & Technol, Grad Sch Sci & Engn, Saga 8408502, Japan
Source
SICE 2002: PROCEEDINGS OF THE 41ST SICE ANNUAL CONFERENCE, VOLS 1-5 | 2002
Keywords
partially observable Markov decision processes; expectation; reinforcement learning; perception; perceptual aliasing;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation and Computer Technology];
Subject classification code
0812;
Abstract
The problem of developing good or optimal policies for partially observable Markov decision processes (POMDP) remains one of the most alluring areas of research in artificial intelligence. Encouraged by the way we (humans) form expectations from past experiences, and by how our decisions and behaviour are shaped by those expectations, this paper proposes a method called expectation and action augmented states (EAAS) in reinforcement learning, aimed at discovering good or near-optimal policies in partially observable environments. The method uses the concept of expectation to distinguish between aliased states: it augments the agent's observation with the agent's expectation of that observation. Two problems from the literature were used to test the proposed method. The results show promising characteristics of the method compared to some methods currently used in this domain.
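The abstract describes the core mechanism only in prose, so a short sketch may help: the agent's raw observation is paired with its own expectation of that observation, and ordinary value-based learning is run over the augmented state so that two perceptually aliased observations with different expectations are treated as different states. This is a minimal illustrative sketch, not the authors' implementation; the tabular Q-learning setup, the frequency-based expectation model, and all names (EAASAgent, expect, augmented) are assumptions made here for illustration.

```python
# Illustrative sketch of an expectation-augmented-state agent (assumptions, not the EAAS paper's code):
# the augmented state is (observation, expected observation), where the expectation is the
# most frequently seen successor of the previous (observation, action) pair.
import random
from collections import defaultdict

class EAASAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)                            # Q-values over augmented states
        self.counts = defaultdict(lambda: defaultdict(int))    # (prev_obs, prev_action) -> successor counts

    def observe_transition(self, prev_obs, prev_action, obs):
        """Record experience used to form future expectations."""
        self.counts[(prev_obs, prev_action)][obs] += 1

    def expect(self, prev_obs, prev_action):
        """Expected next observation: the most frequent successor seen so far (None if unseen)."""
        seen = self.counts[(prev_obs, prev_action)]
        return max(seen, key=seen.get) if seen else None

    def augmented(self, obs, expectation):
        """Augmented state = (what the agent sees, what it expected to see)."""
        return (obs, expectation)

    def act(self, aug_state):
        """Epsilon-greedy action selection over the augmented state."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(aug_state, a)])

    def update(self, aug_state, action, reward, next_aug_state):
        """Standard Q-learning update, but on augmented states."""
        best_next = max(self.q[(next_aug_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(aug_state, action)] += self.alpha * (td_target - self.q[(aug_state, action)])
```

Under these assumptions, two corridor positions that produce identical observations but follow different histories yield different expectations, hence different augmented states and different Q-values, which is the aliasing-breaking effect the abstract attributes to EAAS.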
Pages: 823-828
Page count: 6