Unsupervised Modeling of Partially Observable Environments

Cited by: 0
Authors:
Graziano, Vincent [1 ]
Koutnik, Jan [1 ]
Schmidhuber, Juergen [1 ]
Affiliations:
[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland
Source:
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I | 2011 / Vol. 6911
Keywords:
Self-Organizing Maps; POMDPs; Reinforcement Learning;
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, Temporal Network for Transitions (TNT), enjoys the freedom of unsupervised learning, works online in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both exploitation and exploration of the environment. Experiments demonstrate that TNT learns good representations of classical reinforcement learning mazes of varying size (up to 20 x 20) under high noise and stochastic actions.
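The abstract's core idea, a self-organizing map whose unit activations act as discrete internal states, with a learned transition model over those states for prediction and planning, can be sketched minimally as follows. This is an illustrative assumption-based sketch, not the paper's actual TNT implementation: the class name `SOMTransitionModel`, the 1-D map topology, and the Laplace-smoothed transition counts are all choices made here for brevity.

```python
import numpy as np

class SOMTransitionModel:
    """Sketch: an online self-organizing map whose best-matching unit
    serves as a discrete internal state, plus a transition-count table
    that yields a predictive model over those states."""

    def __init__(self, n_units, dim, lr=0.1, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(n_units, dim))  # codebook vectors
        self.grid = np.arange(n_units, dtype=float)     # 1-D map topology
        self.trans = np.ones((n_units, n_units))        # counts with Laplace prior
        self.lr, self.sigma = lr, sigma
        self.prev = None                                # previous internal state

    def step(self, x):
        # Best-matching unit = current internal state.
        bmu = int(np.argmin(np.linalg.norm(self.weights - x, axis=1)))
        # Online SOM update: pull the BMU and its neighbors toward the input.
        h = np.exp(-((self.grid - self.grid[bmu]) ** 2) / (2 * self.sigma ** 2))
        self.weights += self.lr * h[:, None] * (x - self.weights)
        # Record the observed state transition to build the predictive model.
        if self.prev is not None:
            self.trans[self.prev, bmu] += 1
        self.prev = bmu
        return bmu

    def predict_next(self, state):
        # Normalized counts give a predicted next-state distribution.
        row = self.trans[state]
        return row / row.sum()
```

A planner could then search over the distributions returned by `predict_next` to choose where to exploit or explore; the actual TNT architecture additionally conditions transitions on actions, which this sketch omits.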
Pages: 503-515 (13 pages)