Development Of Deep Reinforcement Learning Multi-Agent Framework Design Using Self-Organizing Map

Cited by: 0

Authors:

Setyawan, Gembong Edhi [1]

Cholissodin, Imam [1]

Affiliations:

[1] Univ Brawijaya, Fac Comp Sci, Malang, Indonesia

Source:

PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON SUSTAINABLE INFORMATION ENGINEERING AND TECHNOLOGY (SIET 2019) | 2019

Keywords:

framework design; deep reinforcement learning; q-learning; multi-agent; artificial neural network; self-organizing map

DOI:

10.1109/siet48054.2019.8986121

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Deep reinforcement learning (RL) for automation is advancing rapidly, driven in part by new ways of combining RL algorithms with deep learning. One such combination pairs the Q-learning algorithm with an artificial neural network (ANN), a family of deep learning algorithms within artificial intelligence. Finding the right combination for a given problem remains a challenge for many researchers, although some approaches also combine RL with non-ANN methods. In addition, most RL systems use only a single combination, so it is still unclear whether the ideal choice is one ANN algorithm or several. This study proposes a framework design in which the Self-Organizing Map (SOM) algorithm adaptively combines the candidate algorithms and acts as the actor that computes the final Q-value, which is updated continuously and dynamically from one or more Q-values. The resulting framework indicates that SOM can provide an adaptive combination of the algorithms to be used in deep RL.
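The abstract does not say how the SOM turns several Q-values into a final one, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes three hypothetical Q-estimators, a 1-D map of four nodes, a Gaussian neighbourhood, and a made-up combination rule (`combined_q` averages the best-matching unit's weight vector). The function names, hyperparameters, and sample numbers are all illustrative assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the SOM node whose weight vector is closest to x."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(node, x)) for node in weights]
    return dists.index(min(dists))


def som_update(weights, x, lr=0.5, sigma=1.0):
    """One SOM step on a 1-D map: pull the BMU and its neighbours toward x."""
    bmu = best_matching_unit(weights, x)
    for j, node in enumerate(weights):
        # Gaussian neighbourhood: 1.0 at the BMU, decaying with map distance.
        h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
        for d in range(len(node)):
            node[d] += lr * h * (x[d] - node[d])
    return bmu


def combined_q(weights, q_values):
    """Assumed combination rule: average the BMU's weight vector."""
    node = weights[best_matching_unit(weights, q_values)]
    return sum(node) / len(node)


# Hypothetical setup: 3 Q-estimators feeding a 1-D SOM with 4 nodes.
rng = random.Random(0)
som = [[rng.random() for _ in range(3)] for _ in range(4)]

# Q(s, a) reported by each estimator for one state-action pair (made-up values).
sample = [0.2, 0.4, 0.9]

for _ in range(50):
    som_update(som, sample)

q_final = combined_q(som, sample)
```

In this toy run the map sees the same input repeatedly, so the best-matching unit converges to the input vector and `q_final` approaches the plain mean of the three estimates. With varied inputs, different map nodes would specialise in different regions of the Q-value space, which is one way to read "adaptive combination".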

Pages: 246-250

Page count: 5
