Unsupervised Modeling of Partially Observable Environments

Cited by: 0
Authors
Graziano, Vincent [1 ]
Koutnik, Jan [1 ]
Schmidhuber, Juergen [1 ]
Affiliation
[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I | 2011 / Vol. 6911
Keywords
Self-Organizing Maps; POMDPs; Reinforcement Learning;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, Temporal Network for Transitions (TNT), enjoys the freedoms of unsupervised learning, works on-line in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both exploitation and exploration of the environment. Experiments demonstrate that TNT learns good representations of classical reinforcement-learning mazes of varying size (up to 20 x 20) under high noise and stochastic actions.
Pages: 503-515
Page count: 13
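
The abstract describes the approach only at a high level: a self-organizing map quantizes raw observations into discrete internal states, and a predictive transition model over those states is used for planning. The following is a minimal illustrative sketch of that idea under assumed details (1-D SOM lattice, fixed learning rate and neighbourhood width, count-based transition estimates); it is not the authors' TNT implementation.

# Minimal sketch: SOM-based sensory layer plus a transition model over
# SOM units. All hyperparameters and update rules here are assumptions.
import numpy as np

class SOMTransitionModel:
    def __init__(self, n_units=16, obs_dim=2, n_actions=4,
                 lr=0.1, sigma=2.0, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(n_units, obs_dim))  # SOM codebook
        self.grid = np.arange(n_units)                       # 1-D lattice coordinates
        self.lr, self.sigma = lr, sigma
        # Laplace-smoothed transition counts: counts[a, s, s'] over SOM units.
        self.counts = np.ones((n_actions, n_units, n_units))

    def quantize(self, obs):
        # Index of the best-matching unit (internal state) for an observation.
        return int(np.argmin(np.linalg.norm(self.weights - obs, axis=1)))

    def update_som(self, obs):
        # Standard online SOM update: pull the winner and its neighbours
        # toward the observation, weighted by a Gaussian neighbourhood.
        bmu = self.quantize(obs)
        dist = np.abs(self.grid - bmu)
        h = np.exp(-(dist ** 2) / (2 * self.sigma ** 2))
        self.weights += self.lr * h[:, None] * (obs - self.weights)
        return bmu

    def observe_transition(self, s, action, s_next):
        # Record a transition between internal states under an action.
        self.counts[action, s, s_next] += 1

    def transition_probs(self, s, action):
        # Predictive model over the internal representation, usable by a planner.
        row = self.counts[action, s]
        return row / row.sum()

# Assumed toy usage: quantize each observation on-line, then count the
# observed (state, action, next state) transitions for planning.
#   model = SOMTransitionModel()
#   s = model.update_som(obs)
#   s_next = model.update_som(next_obs)
#   model.observe_transition(s, action, s_next)
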