Actor-Q based active perception learning system

Citations: 0
Authors
Shibata, K [1]
Nishino, T [1]
Okabe, Y [1]
Affiliations
[1] Oita Univ, Dept Elect & Elect Engn, Oita 8701192, Japan
Source
2001 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS I-IV, PROCEEDINGS | 2001
Keywords
Actor-Q architecture; reinforcement learning; neural network; active perception; visual sensor
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An active perception learning system based on reinforcement learning is proposed. A novel reinforcement learning architecture called Actor-Q is employed, in which Q-learning and Actor-Critic are combined. The system decides its actions according to Q-values. One of the actions is to move its sensor, and the others are to output an answer as its recognition result, each of which corresponds to one pattern. When the sensor motion is selected, the sensor moves according to the actor's output signals. The Q-value for the sensor motion is trained by Q-learning, and the actor is trained by the Q-value for the sensor motion, which takes the place of the critic. When one of the other actions is selected, the system outputs the recognition result. When the recognition answer is correct, the Q-value is trained toward the upper limit of the Q-value, and when the answer is not correct, it is trained toward 0.0. The module that computes the Q-values and the actor module each consist of a neural network and are trained by error backpropagation. The training signals are generated by the reinforcement learning described above. Simulations using a visual sensor with non-uniform visual cells confirmed that the system moves its sensor to a place where it can recognize the presented pattern correctly. Even though the Q-value surface as a function of the sensor location has some local peaks, the sensor was not trapped and moved in the appropriate direction because the Q-value for the sensor motion becomes larger.
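The training signals described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the constants `Q_MAX` and `GAMMA`, the epsilon-greedy selection, the zero intermediate reward, and the form of the actor update are all assumptions made for illustration, since the abstract does not specify them. The neural networks are omitted; only the targets they would be trained toward are shown.

```python
import random

Q_MAX = 0.9   # assumed upper limit of the Q-value (hypothetical value)
GAMMA = 0.9   # assumed discount factor (hypothetical value)
MOVE = 0      # index of the "move sensor" action; indices 1..N are recognition answers


def select_action(q_values, epsilon=0.1, rng=random):
    """Pick an action from the Q-values (epsilon-greedy is an assumption)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_target(action, q_next=None, answer_correct=None):
    """Training target for the Q-value of the selected action."""
    if action == MOVE:
        # Q-learning target for the sensor motion; assumes no intermediate reward.
        return GAMMA * max(q_next)
    # Recognition answer: upper limit if correct, 0.0 otherwise (as in the abstract).
    return Q_MAX if answer_correct else 0.0


def actor_target(actor_out, noise, q_move_now, q_next):
    """Actor training signal that uses the move Q-value in place of a critic.

    Assumed rule: shift the actor output along the exploration noise,
    scaled by a TD-like error computed from the Q-values.
    """
    td_error = GAMMA * max(q_next) - q_move_now
    return [a + td_error * n for a, n in zip(actor_out, noise)]
```

For example, `q_target(1, answer_correct=True)` returns the assumed upper limit `Q_MAX`, while a wrong answer yields a target of `0.0`. Because the move action bootstraps from the best Q-value at the next sensor location, its value can grow along a path toward locations where recognition succeeds, which is consistent with the abstract's remark that the sensor was not trapped at local peaks.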
Pages: 1000-1005
Page count: 6