Actor-Q based active perception learning system

Cited by: 0
Authors
Shibata, K [1 ]
Nishino, T [1 ]
Okabe, Y [1 ]
Affiliation
[1] Oita Univ, Dept Elect & Elect Engn, Oita 8701192, Japan
Source
2001 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS I-IV, PROCEEDINGS | 2001年
Keywords
Actor-Q architecture; reinforcement learning; neural network; active perception; visual sensor;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
An active perception learning system based on reinforcement learning is proposed. A novel reinforcement learning architecture called Actor-Q, which combines Q-learning and Actor-Critic, is employed. The system decides its actions according to Q-values. One of the actions is to move its sensor, and each of the others is to answer with a recognition result, one action per pattern. When the sensor motion is selected, the sensor moves according to the actor's output signals. The Q-value for the sensor motion is trained by Q-learning, and the actor is trained by the Q-value for the sensor motion, which serves in place of the critic. When one of the other actions is selected, the system outputs the corresponding recognition result. When the recognition answer is correct, the Q-value is trained toward the upper limit of the Q-value; when the answer is incorrect, it is trained toward 0.0. The module that computes the Q-values and the actor module each consist of a neural network and are trained by error back-propagation, with training signals generated by the reinforcement learning scheme above. Simulations using a visual sensor with non-uniform visual cells confirmed that the system moves its sensor to a location where it can recognize the presented pattern correctly. Even though the Q-value surface as a function of the sensor location has some local peaks, the sensor was not trapped and moved in the appropriate direction, because the Q-value for the sensor motion becomes larger.
Pages: 1000-1005
Page count: 6
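
The update rules summarized in the abstract can be sketched in code. The following is a minimal illustration, assuming a one-hidden-layer network trained by back-propagation; the hyper-parameters (GAMMA, Q_MAX, SIGMA, learning rate, network sizes), the 16-dimensional observation, the 2-D sensor motion, and the env_move callback are all illustrative assumptions, not values from the paper.

# A minimal sketch of the Actor-Q update rules described in the abstract.
# All hyper-parameters and the env_move environment callback are
# illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

class MLP:
    """One-hidden-layer network trained by error back-propagation."""
    def __init__(self, n_in, n_hidden, n_out, lr=0.05):
        self.w1 = rng.normal(0.0, 0.3, (n_in, n_hidden))
        self.w2 = rng.normal(0.0, 0.3, (n_hidden, n_out))
        self.lr = lr

    def forward(self, x):
        self.x = x
        self.h = np.tanh(x @ self.w1)
        self.y = self.h @ self.w2  # linear output layer
        return self.y

    def backward(self, target):
        # One step of plain back-propagation on squared error.
        dy = self.y - target
        dh = (dy @ self.w2.T) * (1.0 - self.h ** 2)
        self.w2 -= self.lr * np.outer(self.h, dy)
        self.w1 -= self.lr * np.outer(self.x, dh)

N_PATTERNS = 3  # actions 1..N_PATTERNS answer a pattern; action 0 moves the sensor
GAMMA, Q_MAX, SIGMA = 0.9, 1.0, 0.1

q_net = MLP(n_in=16, n_hidden=20, n_out=1 + N_PATTERNS)  # one Q-value per action
actor = MLP(n_in=16, n_hidden=20, n_out=2)               # 2-D sensor motion

def step(obs, true_pattern, env_move):
    """One interaction; obs is the current sensor image (length-16 vector)."""
    q = q_net.forward(obs)
    a = int(np.argmax(q))  # greedy here; the paper's system would also explore
    target = q.copy()
    if a == 0:
        # Sensor-motion action: the actor's (noisy) output moves the sensor.
        motion = actor.forward(obs)
        noise = rng.normal(0.0, SIGMA, motion.shape)
        next_obs = env_move(motion + noise)   # hypothetical environment call
        q_next = q_net.forward(next_obs)
        td_target = GAMMA * q_next.max()      # no immediate reward assumed
        td_error = td_target - q[0]
        target[0] = td_target                 # Q-learning for the motion action
        q_net.forward(obs)                    # restore cached activations for obs
        q_net.backward(target)
        # The motion Q-value stands in for the critic: reinforce the
        # explored motion in proportion to the TD error.
        actor.forward(obs)
        actor.backward(motion + td_error * noise)
        return next_obs
    # Recognition action: push its Q-value toward the upper limit Q_MAX
    # if the answer is correct, toward 0.0 otherwise.
    target[a] = Q_MAX if (a - 1) == true_pattern else 0.0
    q_net.backward(target)
    return None  # the episode ends once the system answers

In use, step would be called in a loop, moving the sensor until a recognition action is chosen; Boltzmann or epsilon-greedy selection over the Q-values would replace the pure argmax during training.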