Reinforcement learning with expectation and action augmented states in a partially observable environment

Cited: 0
Authors
Guirnaldo, SA [1 ]
Watanabe, K [1 ]
Izumi, K [1 ]
Kiguchi, K [1 ]
Affiliations
[1] Saga Univ, Fac Engn Syst & Technol, Grad Sch Sci & Engn, Saga 8408502, Japan
Source
SICE 2002: PROCEEDINGS OF THE 41ST SICE ANNUAL CONFERENCE, VOLS 1-5 | 2002
Keywords
partially observable Markov decision processes; expectation; reinforcement learning; perception; perceptual aliasing;
DOI
Not available
Chinese Library Classification
TP [automation technology, computer technology];
Discipline Classification Code
0812;
Abstract
The problem of developing good or optimal policies for partially observable Markov decision processes (POMDPs) remains one of the most alluring areas of research in artificial intelligence. Encouraged by the way we (humans) form expectations from past experiences and by how our decisions and behaviour are affected by those expectations, this paper proposes a reinforcement learning method called expectation and action augmented states (EAAS), aimed at discovering good or near-optimal policies in partially observable environments. The method uses the concept of expectation to distinguish between aliased states: it works by augmenting the agent's observation with its expectation of that observation. Two problems from the literature were used to test the proposed method. The results show promising characteristics of the method compared to some methods currently used in this domain.
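The abstract's core idea, augmenting each observation with an expectation of it so that aliased observations become distinguishable, can be illustrated with a small sketch. The Python code below is not the paper's algorithm; it is a minimal reading under two assumptions: the expectation is taken to be the most frequently observed successor of the previous observation-action pair, and tabular Q-learning is run over the augmented (observation, expectation) states. All names (EAASAgent, expect, augment, record) are illustrative.

import random
from collections import defaultdict

class EAASAgent:
    # Sketch of Q-learning over expectation-augmented states (EAAS-style).
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions                  # discrete action set
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)             # Q[(augmented_state, action)]
        # Frequency model of "what usually follows (obs, action)"; this stands
        # in for the paper's expectation mechanism, which is not given here.
        self.counts = defaultdict(lambda: defaultdict(int))

    def expect(self, prev_obs, prev_action):
        # Expectation = most frequently seen observation after (prev_obs, prev_action).
        seen = self.counts[(prev_obs, prev_action)]
        return max(seen, key=seen.get) if seen else None

    def augment(self, obs, expectation):
        # Augmented state: the raw observation paired with the expected observation,
        # so two aliased observations with different expectations map to
        # different table entries.
        return (obs, expectation)

    def act(self, state):
        # Epsilon-greedy action selection over the augmented state.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard tabular Q-learning update on augmented states.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def record(self, prev_obs, prev_action, obs):
        # Update the expectation model after every observed transition.
        self.counts[(prev_obs, prev_action)][obs] += 1

In an episode loop, the agent would call record() after each step, form the next augmented state with augment(obs, expect(prev_obs, prev_action)), and apply update() exactly as in ordinary Q-learning. Only the state representation changes, which matches the abstract's claim that the method works by augmenting the observation rather than by changing the learning rule.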
Pages: 823-828
Number of pages: 6
Related Papers
50 records in total
  • [11] Disturbance Observable Reinforcement Learning that Compensates for Changes in Environment
    Kim, SeongIn
    Shibuya, Takeshi
    2022 61ST ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS (SICE), 2022, : 141 - 145
  • [12] Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation
    Wu, Yaxiong
    Macdonald, Craig
    Ounis, Iadh
    15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021), 2021, : 241 - 251
  • [13] Modeling and reinforcement learning in partially observable many-agent systems
    He, Keyang
    Doshi, Prashant
    Banerjee, Bikramjit
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (01)
  • [14] A novel approach for self-driving car in partially observable environment using life long reinforcement learning
    Quadir, Md Abdul
    Jaiswal, Dibyanshu
    Mohan, Senthilkumar
    Innab, Nisreen
    Sulaiman, Riza
    Alaoui, Mohammed Kbiri
    Ahmadian, Ali
    SUSTAINABLE ENERGY GRIDS & NETWORKS, 2024, 38
  • [15] Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes
    Sharma, Rajneesh
    Spaan, Matthijs T. J.
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1422 - 1429
  • [16] Deep Reinforcement Learning for Partially Observable Data Poisoning Attack in Crowdsensing Systems
    Li, Mohan
    Sun, Yanbin
    Lu, Hui
    Maharjan, Sabita
    Tian, Zhihong
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (07): : 6266 - 6278
  • [17] A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game
    Shin Ishii
    Hajime Fujita
    Masaoki Mitsutake
    Tatsuya Yamazaki
    Jun Matsuda
    Yoichiro Matsuno
    Machine Learning, 2005, 59 : 31 - 54
  • [18] A reinforcement learning scheme for a partially-observable multi-agent game
    Ishii, S
    Fujita, H
    Mitsutake, M
    Yamazaki, T
    Matsuda, J
    Matsuno, Y
    MACHINE LEARNING, 2005, 59 (1-2) : 31 - 54
  • [19] Learning Intelligent Behavior in a Non-stationary and Partially Observable Environment
    Selçuk Şenkul
    Faruk Polat
    Artificial Intelligence Review, 2002, 18 : 97 - 115
  • [20] Learning intelligent behavior in a non-stationary and partially observable environment
    Senkul, S
    Polat, F
    ARTIFICIAL INTELLIGENCE REVIEW, 2002, 18 (02) : 97 - 115