Actor-Q based active perception learning system

Cited by: 0

Authors:

Shibata, K [1]

Nishino, T [1]

Okabe, Y [1]

Affiliation:

[1] Oita Univ, Dept Elect & Elect Engn, Oita 8701192, Japan

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS I-IV, PROCEEDINGS | 2001

Keywords:

Actor-Q architecture; reinforcement learning; neural network; active perception; visual sensor;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

An active perception learning system based on reinforcement learning is proposed. A novel reinforcement learning architecture called Actor-Q is employed, in which Q-learning and Actor-Critic are combined. The system decides its actions according to Q-values. One of the actions is to move its sensor, and each of the others is to output an answer as its recognition result, with one such action per pattern. When the sensor motion is selected, the sensor moves according to the actor's output signals. The Q-value for the sensor motion is trained by Q-learning, and the actor is trained by the Q-value for the sensor motion, which takes the place of the critic. When one of the other actions is selected, the system outputs the recognition result. When the recognition answer is correct, the Q-value is trained toward the upper limit of the Q-value, and when the answer is not correct, it is trained toward 0.0. The module that computes the Q-values and the actor module each consist of a neural network and are trained by error back-propagation. The training signals are generated based on the above reinforcement learning. Simulations using a visual sensor with non-uniform visual cells confirmed that the system moves its sensor to a place where it can recognize the presented pattern correctly. Even though the Q-value surface as a function of the sensor location has some local peaks, the sensor was not trapped and moved in the appropriate direction because the Q-value for the sensor motion becomes larger.
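The training signals described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the discount factor `GAMMA`, the upper limit `Q_MAX`, the zero reward for sensor motion, and the noise-based actor update in `actor_target` are all assumptions made for illustration, and every function name is hypothetical. Only the correct/incorrect answer targets (upper limit and 0.0) and the use of Q-learning for the sensor-motion action come from the abstract.

```python
GAMMA = 0.9  # assumed discount factor (not stated in the abstract)
Q_MAX = 1.0  # assumed upper limit of the Q-value
MOVE = 0     # action index for "move the sensor"; 1..K are recognition answers


def select_action(q_values):
    """Greedy choice over Q-values (exploration is omitted in this sketch)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def q_target(action, q_next=None, answer_correct=False):
    """Training target for the Q-value of the selected action."""
    if action == MOVE:
        # Q-learning target, assuming zero immediate reward for moving.
        return min(Q_MAX, GAMMA * max(q_next))
    # Recognition answer: upper limit if correct, 0.0 otherwise.
    return Q_MAX if answer_correct else 0.0


def actor_target(actor_out, noise, q_move, q_move_target):
    """Assumed actor update: reinforce the exploration noise in proportion
    to the error of the sensor-motion Q-value, which stands in for the critic."""
    td_error = q_move_target - q_move
    return [a + td_error * n for a, n in zip(actor_out, noise)]
```

In this sketch both targets would then serve as supervised training signals for the Q-value network and the actor network, respectively.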

Citation

Pages: 1000-1005

Page count: 6
