Unsupervised Modeling of Partially Observable Environments

Cited by: 0

Authors:

Graziano, Vincent [1]

Koutnik, Jan [1]

Schmidhuber, Juergen [1]

Affiliations:

[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland

Source:

MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I | 2011 / Vol. 6911

Keywords:

Self-Organizing Maps; POMDPs; Reinforcement Learning;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, temporal network for transitions (TNT), enjoys the freedoms of unsupervised learning, works on-line, in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both the exploitation and exploration of the environment. Experiments demonstrate that TNT learns nice representations of classical reinforcement learning mazes of varying size (up to 20 x 20) under conditions of high-noise and stochastic actions.


Pages: 503-515

Page count: 13

50 items in total

[1] Spatial Consciousness Model of Intrinsic Reward in Partially Observable Environments
Zhenghongyuan Ni
Ye Jin
Peng Liu
Wei Zhao
Journal of Intelligent & Robotic Systems, 2022, 106
[2] Spatial Consciousness Model of Intrinsic Reward in Partially Observable Environments
Ni, Zhenghongyuan
Jin, Ye
Liu, Peng
Zhao, Wei
JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2022, 106 (04)
[3] Reinforcement Learning of Chaotic Systems Control in Partially Observable Environments
Weissenbacher, Max
Borovykh, Anastasia
Rigas, Georgios
FLOW TURBULENCE AND COMBUSTION, 2025
[4] Multi-task Reinforcement Learning in Partially Observable Stochastic Environments
Li, Hui
Liao, Xuejun
Carin, Lawrence
JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1131 - 1186
[5] Novelty detection improves performance of reinforcement learners in fluctuating, partially observable environments
Marzen, Sarah E.
JOURNAL OF THEORETICAL BIOLOGY, 2019, 477 : 44 - 50
[6] Modeling and reinforcement learning in partially observable many-agent systems
He, Keyang
Doshi, Prashant
Banerjee, Bikramjit
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (01)
[7] Q-Mixing Network for Multi-agent Pathfinding in Partially Observable Grid Environments
Davydov, Vasilii
Skrynnik, Alexey
Yakovlev, Konstantin
Panov, Aleksandr
ARTIFICIAL INTELLIGENCE, RCAI 2021, 2021, 12948 : 169 - 179
[8] A gradient-based reinforcement learning approach to dynamic pricing in partially-observable environments
Vengerov, David
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2008, 24 (07) : 687 - 693
[9] Modeling Partially Observable Systems using Graph-Based Memory and Topological Priors
Morad, Steven D.
Liwicki, Stephan
Kortvelesy, Ryan
Mecca, Roberto
Prorok, Amanda
LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 168, 2022, 168
[10] Solving Partially Observable Environments with Universal Search Using Dataflow Graph-Based Programming Model
Paul, Swarna Kamal
Bhaumik, Parama
IETE JOURNAL OF RESEARCH, 2023, 69 (09) : 6137 - 6151
