Actor-Q based active perception learning system

Cited by: 0
Authors
Shibata, K [1 ]
Nishino, T [1 ]
Okabe, Y [1 ]
Affiliation
[1] Oita Univ, Dept Elect & Elect Engn, Oita 8701192, Japan
Source
2001 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS I-IV, PROCEEDINGS | 2001年
Keywords
Actor-Q architecture; reinforcement learning; neural network; active perception; visual sensor;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
An active perception learning system based on reinforcement learning is proposed. A novel reinforcement learning architecture called Actor-Q, which combines Q-learning and Actor-Critic, is employed. The system decides its actions according to Q-values. One of the actions is to move its sensor, and each of the others is to answer with a recognition result, one action per pattern. When the sensor motion is selected, the sensor moves according to the actor's output signals. The Q-value for the sensor motion is trained by Q-learning, and the actor is trained by the Q-value for the sensor motion, which serves in place of the critic. When one of the other actions is selected, the system outputs the corresponding recognition result. When the recognition answer is correct, the Q-value is trained toward the upper limit of the Q-value; when the answer is incorrect, it is trained toward 0.0. The module that computes the Q-values and the actor module each consist of a neural network and are trained by error back-propagation, with training signals generated by the reinforcement learning scheme above. Simulations using a visual sensor with non-uniform visual cells confirmed that the system moves its sensor to a location where it can recognize the presented pattern correctly. Even though the Q-value surface as a function of the sensor location has some local peaks, the sensor was not trapped and moved in the appropriate direction, because the Q-value for the sensor motion becomes larger.
Pages: 1000-1005
Page count: 6
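
The update rules summarized in the abstract can be sketched in code. The following is a minimal illustration, assuming a one-hidden-layer network trained by back-propagation; the hyper-parameters (GAMMA, Q_MAX, SIGMA, learning rate, network sizes), the 16-dimensional observation, the 2-D sensor motion, and the env_move callback are all illustrative assumptions, not values from the paper.

# A minimal sketch of the Actor-Q update rules described in the abstract.
# All hyper-parameters and the env_move environment callback are
# illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

class MLP:
    """One-hidden-layer network trained by error back-propagation."""
    def __init__(self, n_in, n_hidden, n_out, lr=0.05):
        self.w1 = rng.normal(0.0, 0.3, (n_in, n_hidden))
        self.w2 = rng.normal(0.0, 0.3, (n_hidden, n_out))
        self.lr = lr

    def forward(self, x):
        self.x = x
        self.h = np.tanh(x @ self.w1)
        self.y = self.h @ self.w2  # linear output layer
        return self.y

    def backward(self, target):
        # One step of plain back-propagation on squared error.
        dy = self.y - target
        dh = (dy @ self.w2.T) * (1.0 - self.h ** 2)
        self.w2 -= self.lr * np.outer(self.h, dy)
        self.w1 -= self.lr * np.outer(self.x, dh)

N_PATTERNS = 3  # actions 1..N_PATTERNS answer a pattern; action 0 moves the sensor
GAMMA, Q_MAX, SIGMA = 0.9, 1.0, 0.1

q_net = MLP(n_in=16, n_hidden=20, n_out=1 + N_PATTERNS)  # one Q-value per action
actor = MLP(n_in=16, n_hidden=20, n_out=2)               # 2-D sensor motion

def step(obs, true_pattern, env_move):
    """One interaction; obs is the current sensor image (length-16 vector)."""
    q = q_net.forward(obs)
    a = int(np.argmax(q))  # greedy here; the paper's system would also explore
    target = q.copy()
    if a == 0:
        # Sensor-motion action: the actor's (noisy) output moves the sensor.
        motion = actor.forward(obs)
        noise = rng.normal(0.0, SIGMA, motion.shape)
        next_obs = env_move(motion + noise)   # hypothetical environment call
        q_next = q_net.forward(next_obs)
        td_target = GAMMA * q_next.max()      # no immediate reward assumed
        td_error = td_target - q[0]
        target[0] = td_target                 # Q-learning for the motion action
        q_net.forward(obs)                    # restore cached activations for obs
        q_net.backward(target)
        # The motion Q-value stands in for the critic: reinforce the
        # explored motion in proportion to the TD error.
        actor.forward(obs)
        actor.backward(motion + td_error * noise)
        return next_obs
    # Recognition action: push its Q-value toward the upper limit Q_MAX
    # if the answer is correct, toward 0.0 otherwise.
    target[a] = Q_MAX if (a - 1) == true_pattern else 0.0
    q_net.backward(target)
    return None  # the episode ends once the system answers

In use, step would be called in a loop, moving the sensor until a recognition action is chosen; Boltzmann or epsilon-greedy selection over the Q-values would replace the pure argmax during training.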