Reinforcement Learning based on State Space Model using Growing Neural Gas for a Mobile Robot

Cited by: 2

Authors:

Arai, Tomoyuki [1]

Toda, Yuichiro [2]

Iwasa, Mutsumi [1]

Shao, Shuai [1]

Tonomura, Ryuta [1]

Kubota, Naoyuki [1]

Affiliations:

[1] Tokyo Metropolitan Univ, Grad Sch Syst Design, Tokyo, Japan

[2] Okayama Univ, Grad Sch Nat Sci Technol, Okayama, Japan

Source:

2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS) | 2018

Keywords:

Reinforcement Learning; Self-organization; Machine Learning; Mobile Robot; State Space;

DOI:

10.1109/SCIS-ISIS.2018.00220

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When Reinforcement Learning is applied to real tasks, constructing the state space is an important problem. To operate in a real-world environment, the agent must deal with continuous information. We therefore propose a state space construction model based on Growing Neural Gas. In our system, the agent autonomously constructs a State Space Model from its own experience. Furthermore, it can reconstruct a suitable state space to adapt to the complexity of the environment. Our experiments show that the proposed method performs as well as the conventional method while using a smaller number of states.


Pages: 1410-1413

Page count: 4

