A temporal-difference learning method using Gaussian state representation for continuous state space problems

Cited: 0
Authors
Institution
[1] Graduate School of Engineering, Osaka City University
Source
Transactions of the Japanese Society for Artificial Intelligence, 2014, Vol. 29
Keywords
Continuous state spaces; Gaussian state representation; Reinforcement learning; TD learning;
DOI
10.1527/tjsai.29.157
Abstract
In this paper, we tackle the problem of reinforcement learning (RL) in a continuous state space. An appropriate discretization of the space can make many learning tasks tractable. A method using a Gaussian state representation and the Rational Policy Making (RPM) algorithm has been proposed for this problem. It discretizes the space by constructing a chain of states that represents a path to the goal, exploiting the agent's past experiences of reaching it. Because it relies heavily on successful experiences, it finds a rational solution quickly in an environment with little noise; in a noisy environment, however, it generates many unnecessary and distracting states and performs the task poorly. For learning in such an environment, we introduce the concept of the value of a state into the above method and develop a new method. The new method uses a temporal-difference (TD) learning algorithm to learn the values of states, and each state's value is used to determine the size of that state. As a result, the method can quickly trim and eliminate unnecessary and distracting states and learn the task well even in a noisy environment. We show the effectiveness of our method through computer simulations of a path-finding task and a cart-pole swing-up task. © The Japanese Society for Artificial Intelligence 2014.
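The abstract combines two generic ingredients that can be sketched independently of the paper's actual RPM-based algorithm: states represented as Gaussians (a center plus a width), TD(0) learning of each state's value, and the idea that a state's learned value governs its size, so low-value states shrink and are eventually pruned. The following is a minimal illustrative sketch under those assumptions only — the class and parameter names (`GaussianState`, `alpha`, `gamma`, the shrink rate, the width floor) are hypothetical and not taken from the paper.

```python
import numpy as np

# Illustrative sketch, not the authors' method: Gaussian state
# representation + TD(0) value learning + value-driven state resizing.

class GaussianState:
    """A state represented by a Gaussian: a center and a width (sigma)."""
    def __init__(self, center, sigma):
        self.center = np.asarray(center, dtype=float)
        self.sigma = float(sigma)
        self.value = 0.0  # V(s), learned by TD

    def activation(self, x):
        """Gaussian membership of observation x in this state."""
        d2 = np.sum((np.asarray(x, dtype=float) - self.center) ** 2)
        return float(np.exp(-d2 / (2.0 * self.sigma ** 2)))

def td_update(state, next_state, reward, alpha=0.1, gamma=0.95):
    """One TD(0) backup: V(s) <- V(s) + alpha*(r + gamma*V(s') - V(s))."""
    td_error = reward + gamma * next_state.value - state.value
    state.value += alpha * td_error
    return td_error

def resize_by_value(states, shrink=0.9, floor=1e-3):
    """Toy version of 'value determines state size': states whose value
    is below the mean shrink, and are dropped once too narrow."""
    mean_v = np.mean([s.value for s in states])
    survivors = []
    for s in states:
        if s.value < mean_v:
            s.sigma *= shrink       # shrink a low-value state
        if s.sigma >= floor:
            survivors.append(s)     # prune states that became too small
    return survivors
```

For example, after one backup with `reward=0`, `alpha=0.5`, `gamma=0.9` against a successor of value 1.0, a fresh state's value moves from 0.0 to 0.45; repeated resizing would then shrink whichever states stay below the mean value, mimicking how distracting states get trimmed away.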
Pages: 157 - 167
Number of pages: 10
Related papers
50 items in total
  • [41] A Method of Role Differentiation Using a State Space Filter with a Waveform Changing Parameter in Multi-agent Reinforcement Learning
    Nagayoshi, Masato
    Elderton, Simon
    Tamaki, Hisashi
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB 2021), 2021, : 461 - 464
  • [42] Multi-objective fuzzy Q-learning to solve continuous state-action problems
    Asgharnia, Amirhossein
    Schwartz, Howard
    Atia, Mohamed
    NEUROCOMPUTING, 2023, 516 : 115 - 132
  • [43] Visualization of Learning Process in "State and Action" Space Using Self-Organizing Maps
    Notsu, Akira
    Hattori, Yuichi
    Ubukata, Seiki
    Honda, Katsuhiro
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2016, 20 (06) : 983 - 991
  • [44] Topological Visualization Method for Understanding the Landscape of Value Functions and Structure of the State Space in Reinforcement Learning
    Nakamura, Yuki
    Shibuya, Takeshi
    ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2020, : 370 - 377
  • [45] Adaptive internal state space construction method for Reinforcement learning of a real-world agent
    Samejima, K
    Omori, T
    NEURAL NETWORKS, 1999, 12 (7-8) : 1143 - 1155
  • [46] Generating Memoryless Policies Faster using Automatic Temporal Abstractions for Reinforcement Learning with Hidden State
    Cilden, Erkin
    Polat, Faruk
    2013 IEEE 25TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2013, : 719 - 726
  • [47] A state-space representation model and learning algorithm for real-time decision-making under uncertainty
    Malikopoulos, Andreas A.
    Assanis, Dennis N.
    Papalambros, Panos Y.
PROCEEDINGS OF THE ASME INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION 2007, VOL 9, PTS A-C: MECHANICAL SYSTEMS AND CONTROL, 2008, : 575 - 584
  • [48] Expression of Continuous State and Action Spaces for Q-Learning Using Neural Networks and CMAC
    Yamada, Kazuaki
    JOURNAL OF ROBOTICS AND MECHATRONICS, 2012, 24 (02) : 330 - 339
  • [49] Reinforcement Learning based on State Space Model using Growing Neural Gas for a Mobile Robot
    Arai, Tomoyuki
    Toda, Yuichiro
    Iwasa, Mutsumi
    Shao, Shuai
    Tonomura, Ryuta
    Kubota, Naoyuki
    2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2018, : 1410 - 1413
  • [50] Intelligent Traffic Signal Duration Adaptation using Q-Learning with An Evolving State Space
    Gaikwad, Vinayak V.
    Kadarkar, Sanket S.
    Kasbekar, Gaurav S.
    2016 IEEE 84TH VEHICULAR TECHNOLOGY CONFERENCE (VTC FALL), 2016,