Cognitively inspired reinforcement learning architecture and its application to giant-swing motion control

Cited by: 4
Authors
Uragami, Daisuke [1 ]
Takahashi, Tatsuji [2 ]
Matsuo, Yoshiki [1 ]
Affiliations
[1] Tokyo Univ Technol, Sch Comp Sci, Hachioji, Tokyo 1920982, Japan
[2] Tokyo Denki Univ, Sch Sci & Technol, Hiki, Saitama 3500394, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Q-learning; Exploration-exploitation dilemma; Bio-inspired computing; Cognitive bias; Loosely symmetric model; Acrobot; Multi-armed bandit problems; ACQUISITION; MODEL; BEHAVIOR; MAP;
DOI
10.1016/j.biosystems.2013.11.002
Chinese Library Classification
Q [Biological Sciences];
Discipline Codes
07; 0710; 09;
Abstract
Many algorithms and methods in artificial intelligence and machine learning have been inspired by human cognition. As a mechanism to handle the exploration-exploitation dilemma in reinforcement learning, the loosely symmetric (LS) value function, which models human causal intuition, was proposed (Shinohara et al., 2007). Besides showing the highest correlation with human causal induction, LS has been reported to work effectively in multi-armed bandit problems, the simplest class of tasks representing the dilemma. However, the scope of application of LS was limited to reinforcement learning problems with K actions and only one state (K-armed bandit problems). This study proposes the LS-Q learning architecture, which can deal with general reinforcement learning tasks with multiple states and delayed reward. We tested the learning performance of the new architecture on giant-swing robot motion learning, where the uncertainty and unknownness of the environment are large. In the test, no ready-made internal models or function approximation of the state space were provided. The simulations showed that while the ordinary Q-learning agent fails to reach the giant-swing motion because of stagnant loops (local optima with low rewards), LS-Q escapes such loops and acquires the giant-swing motion. It is confirmed that the smaller the number of states (in other words, the more coarse-grained the division of states and the more incomplete the state observation), the better LS-Q performs in comparison with Q-learning. We also showed that the high performance of LS-Q depends comparatively little on parameter tuning and learning time. This suggests that the proposed method inspired by human cognition works adaptively in real environments. (C) 2013 Elsevier Ireland Ltd. All rights reserved.
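The exploration-exploitation dilemma discussed in the abstract can be illustrated with the ordinary Q-learning baseline the paper compares against. Below is a minimal sketch of epsilon-greedy Q-learning on a stateless K-armed bandit (the single-state setting the abstract refers to); the function name, arm payoffs, and parameter values are illustrative assumptions, not taken from the paper, and the paper's LS value function is not reproduced here.

```python
import random

def q_learning_bandit(arms, episodes=500, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning on a stateless K-armed bandit (sketch).

    `arms` is a list of callables mapping an RNG to a reward sample.
    With a single state and no successor state, the TD target reduces
    to the immediate reward, so no discount factor appears.
    """
    rng = random.Random(seed)
    q = [0.0] * len(arms)
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(len(arms))                      # explore
        else:
            a = max(range(len(arms)), key=q.__getitem__)      # exploit
        r = arms[a](rng)
        q[a] += alpha * (r - q[a])                            # stateless TD update
    return q

# Hypothetical two-armed bandit: arm 0 pays 0.2, arm 1 pays 1.0.
arms = [lambda rng: 0.2, lambda rng: 1.0]
q = q_learning_bandit(arms)
best = max(range(len(q)), key=q.__getitem__)
```

With purely greedy selection (epsilon = 0) the agent can lock onto arm 0 after its first pull and never discover the better arm, which is the single-state analogue of the "stagnant loops" the abstract describes for the multi-state giant-swing task.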
Pages: 1-9
Related Papers
50 records total
  • [31] Motion Control of Autonomous Vehicle with Domain-Centralized Electronic and Electrical Architecture based on Predictive Reinforcement Learning Control Method
    Du, Guodong
    Zou, Yuan
    Zhang, Xudong
    Zhao, Kaiyu
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1409 - 1416
  • [32] Application of reinforcement learning in control system development
    Vichugov, VN
    Tsapko, GP
    Tsapko, SG
    Korus 2005, Proceedings, 2005, : 732 - 733
  • [33] Application of reinforcement learning in PMLM speed control
    Guo Hong-xia
    Wu Jie
    Liu Yong-qiang
    2007 IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION, VOLS 1-7, 2007, : 2411 - 2415
  • [34] Model predictive control with stage cost shaping inspired by reinforcement learning
    Beckenbach, Lukas
    Osinenko, Pavel
    Streif, Stefan
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 7110 - 7115
  • [35] A Reinforcement Learning Approach for Control of a Nature-Inspired Aerial Vehicle
    Sufiyan, Danial
    Win, Luke Thura Soe
    Win, Shane Kyi Hla
    Soh, Gim Song
    Foong, Shaohui
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 6030 - 6036
  • [36] Reference RL: Reinforcement learning with reference mechanism and its application in traffic signal control
    Lu, Yunxue
    Hegyi, Andreas
    Salomons, A. Maria
    Wang, Hao
    INFORMATION SCIENCES, 2025, 689
  • [37] Review of Deep Reinforcement Learning and Its Application in Modern Renewable Power System Control
    Li, Qingyan
    Lin, Tao
    Yu, Qianyi
    Du, Hui
    Li, Jun
    Fu, Xiyue
    ENERGIES, 2023, 16 (10)
  • [38] Adaptive Reinforcement Learning and Its Application to Robot Compliance Learning
    Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, United States
    J. Rob. Mechatronics, 3 (250-262)
  • [39] Motion Control of a Snake Robot via Cerebellum-inspired Learning Control
    Ouyang, Wenjuan
    Li, Chenzui
    Liang, Wenyu
    Ren, Qinyuan
    Li, Ping
    2018 IEEE 14TH INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2018, : 1010 - 1015
  • [40] A Reinforcement Learning Modular Control Architecture for Fully Automated Vehicles
    Villagra, Jorge
    Milanes, Vicente
    Perez, Joshue
    Godoy, Jorge
    Onieva, Enrique
    Alonso, Javier
    Gonzalez, Carlos
    de Pedro, Teresa
    Garcia, Ricardo
    COMPUTER AIDED SYSTEMS THEORY - EUROCAST 2011, PT II, 2012, 6928 : 390 - 397