Unsupervised Modeling of Partially Observable Environments

Cited by: 0
Authors
Graziano, Vincent [1]
Koutnik, Jan [1]
Schmidhuber, Juergen [1]
Affiliation
[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland
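Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I | 2011, Vol. 6911
Keywords
Self-Organizing Maps; POMDPs; Reinforcement Learning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract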
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, temporal network for transitions (TNT), enjoys the freedoms of unsupervised learning, works on-line, in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both the exploitation and exploration of the environment. Experiments demonstrate that TNT learns nice representations of classical reinforcement learning mazes of varying size (up to 20 x 20) under conditions of high-noise and stochastic actions.
Pages: 503-515
Number of pages: 13