Q-learning with exploration driven by internal dynamics in chaotic neural network

Cited by: 1

Authors:

Matsuki, Toshitaka [1]

Inoue, Souya [1]

Shibata, Katsunari [1]

Affiliations:

[1] Oita Univ, Fac Sci & Technol, Dept Innovat Engn, Oita, Japan

Source:

2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020

Keywords:

Q-learning; exploration; chaotic neural network; chaos-based reinforcement learning; reservoir network; END SPEECH RECOGNITION; REPRESENTATIONS; STATE;

DOI:

10.1109/ijcnn48605.2020.9207114

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper shows that chaos-based reinforcement learning (RL) using a chaotic neural network (NN) functions not only with Actor-Critic but also with Q-learning. In the chaos-based RL that we have proposed, exploration is driven by the internal dynamics of a chaotic NN, and these dynamics are expected to become more rational through learning. Q-learning is a very popular RL method and is widely used in research. We focused on whether Q-learning can be applied to chaos-based RL. We then demonstrated that an agent can learn a goal task in a grid world environment with chaos-based RL using Q-learning. It was also shown that, as learning progresses, the irregularity in the network outputs originating from the internal chaotic dynamics decreases, and the agent can automatically switch from exploration mode to exploitation mode. Moreover, it was confirmed that the agent can adapt to changes in the environment and automatically resume exploration.
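For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update rule, given only for orientation. It is not the authors' implementation: according to the abstract, the paper computes Q-values with a chaotic neural network and obtains exploration from the network's internal dynamics, whereas this sketch uses a plain lookup table. The function name `q_learning_update`, the learning rate `alpha`, the discount factor `gamma`, and the toy table size are all assumptions made for illustration.

```python
def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a list of lists indexed as q[state][action].
    alpha and gamma are illustrative values, not taken from the paper.
    """
    # Target: the reward alone at a terminal step, otherwise the reward
    # plus the discounted value of the best action in the next state.
    if done:
        target = reward
    else:
        target = reward + gamma * max(q[next_state])
    td_error = target - q[state][action]
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical toy table: 4 states, 2 actions, all values start at zero.
q_table = [[0.0, 0.0] for _ in range(4)]
td = q_learning_update(q_table, state=2, action=1, reward=1.0,
                       next_state=3, done=True)
```

In a conventional setup the action would be chosen with externally injected randomness such as epsilon-greedy selection; the abstract indicates that chaos-based RL instead relies on the irregularity of the network outputs, which decreases as learning progresses.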

References

Pages: 7

28 entries in total

[1] Amodei D, 2016, PR MACH LEARN RES, V48

[2] [Anonymous], 2015, Reinforcement Learning: An Introduction

[3] Babinec S, 2006, LECT NOTES COMPUT SC, V4131, P367

[4] Bush, K; Anderson, C. Modeling reward functions for incomplete state representations via echo state networks [J]. Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005: 2995-3000

[5] Freeman, WJ. The Physiology of Perception [J]. Scientific American, 1991, 264 (02): 78-85

[6] Goto, Yuki; Shibata, Katsunari. Influence of the Chaotic Property on Reinforcement Learning Using a Chaotic Neural Network [J]. Neural Information Processing, ICONIP 2017, Pt I, 2017, 10634: 759-767

[7] Graves A, 2014, PR MACH LEARN RES, V32, P1764

[8] Hausknecht M, 2015, AAAI FALL S SEQUENTI

[9] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition [J]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778

[10] Hoerzer, Gregor M.; Legenstein, Robert; Maass, Wolfgang. Emergence of Complex Computational Structures From Chaotic Neural Networks Through Reward-Modulated Hebbian Learning [J]. Cerebral Cortex, 2014, 24 (03): 677-690
