Unsupervised Modeling of Partially Observable Environments

Cited by: 0
Authors:
Graziano, Vincent [1 ]
Koutnik, Jan [1 ]
Schmidhuber, Juergen [1 ]
Affiliations:
[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland
Source:
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I | 2011 / Vol. 6911
Keywords:
Self-Organizing Maps; POMDPs; Reinforcement Learning;
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, Temporal Network for Transitions (TNT), enjoys the freedom of unsupervised learning, works online in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both exploitation and exploration of the environment. Experiments demonstrate that TNT learns good representations of classical reinforcement learning mazes of varying size (up to 20 x 20) under high noise and stochastic actions.
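The abstract's core idea, a self-organizing map whose unit activations act as discrete internal states, with a learned transition model over those states for prediction and planning, can be sketched minimally as follows. This is an illustrative assumption-based sketch, not the paper's actual TNT implementation: the class name `SOMTransitionModel`, the 1-D map topology, and the Laplace-smoothed transition counts are all choices made here for brevity.

```python
import numpy as np

class SOMTransitionModel:
    """Sketch: an online self-organizing map whose best-matching unit
    serves as a discrete internal state, plus a transition-count table
    that yields a predictive model over those states."""

    def __init__(self, n_units, dim, lr=0.1, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(n_units, dim))  # codebook vectors
        self.grid = np.arange(n_units, dtype=float)     # 1-D map topology
        self.trans = np.ones((n_units, n_units))        # counts with Laplace prior
        self.lr, self.sigma = lr, sigma
        self.prev = None                                # previous internal state

    def step(self, x):
        # Best-matching unit = current internal state.
        bmu = int(np.argmin(np.linalg.norm(self.weights - x, axis=1)))
        # Online SOM update: pull the BMU and its neighbors toward the input.
        h = np.exp(-((self.grid - self.grid[bmu]) ** 2) / (2 * self.sigma ** 2))
        self.weights += self.lr * h[:, None] * (x - self.weights)
        # Record the observed state transition to build the predictive model.
        if self.prev is not None:
            self.trans[self.prev, bmu] += 1
        self.prev = bmu
        return bmu

    def predict_next(self, state):
        # Normalized counts give a predicted next-state distribution.
        row = self.trans[state]
        return row / row.sum()
```

A planner could then search over the distributions returned by `predict_next` to choose where to exploit or explore; the actual TNT architecture additionally conditions transitions on actions, which this sketch omits.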
Pages: 503-515 (13 pages)