Multi-robot Collaboration Based on Markov Decision Process in Robocup3D Soccer Simulation Game

Cited by: 0

Authors:

Cui Xuanyu [1]

Liang Zhiwei [1]

Yang Yongyi [1]

Shen Ping [1]

Wang Jiawen [1]

Liu Haoran [1]

Fan Kai [1]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210046, Jiangsu, Peoples R China

Source:

2015 27TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC) | 2015

Keywords:

Markov Decision Process; Sarsa Algorithm; Reinforcement learning; Dynamic role assignment; RoboCup;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Close collaboration and a well-designed strategy are indispensable for humanoid robots in the RoboCup soccer competition. To address the low convergence rate in training local strategies, this paper proposes a reinforcement-learning method for optimizing the decision and positioning parameters of soccer robots. First, a Markov decision process is used as the framework for reinforcement learning. Then, an improved method based on the Sarsa algorithm is proposed to overcome the slow convergence of average-reward reinforcement learning. To deal with the large state space that arises in training and to improve generalization ability, the method is applied to Keepaway local training. The training results show that this algorithm converges faster than other ordinary learning algorithms.
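For readers unfamiliar with the Sarsa update named in the abstract, the following is a minimal sketch of the textbook tabular, discounted form on a toy problem. It is not the authors' improved method: the abstract gives no details of their state features, rewards, or Keepaway setup, so the five-state chain environment (`chain_step`), the function names, and all parameter values here are assumptions chosen only for illustration.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    # Break ties between equally valued actions at random.
    return rng.choice([a for a in actions if q[(state, a)] == best])


def sarsa(step, start_state, actions, episodes=500, alpha=0.1,
          gamma=0.95, epsilon=0.1, max_steps=1000, seed=0):
    """Textbook tabular Sarsa; returns a dict mapping (state, action) to Q."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = start_state
        action = epsilon_greedy(q, state, actions, epsilon, rng)
        for _ in range(max_steps):
            next_state, reward, done = step(state, action)
            if done:
                # Terminal transition: there is no successor value to bootstrap from.
                q[(state, action)] += alpha * (reward - q[(state, action)])
                break
            # On-policy: the target uses the action actually selected next.
            next_action = epsilon_greedy(q, next_state, actions, epsilon, rng)
            target = reward + gamma * q[(next_state, next_action)]
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = next_state, next_action
    return q


def chain_step(state, action):
    """Hypothetical toy MDP: states 0..4 on a line; reaching state 4 pays 1 and ends."""
    next_state = max(0, min(4, state + action))
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


q_table = sarsa(chain_step, start_state=0, actions=[-1, 1])
```

The defining feature is that the update target uses the value of the action the policy actually takes in the next state, which makes Sarsa on-policy. A task with a large state space such as Keepaway would typically replace the lookup table with some form of function approximation, which this sketch does not attempt.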


Pages: 4345-4349

Page count: 5

2 records in total

[1] A Reinforcement Learning Approach to Score Goals in RoboCup 3D Soccer Simulation for Nao Humanoid Robot
Fahami, Mohammad Amin
Roshanzamir, Mohamad
Izadi, Navid Hoseini
PROCEEDINGS OF THE 2017 7TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2017, : 450 - 454
[2] Markov decision process based multi-round negotiation in manufacturing service collaboration under dynamic pressure conditions
Liu, Bo
Zhang, Yongping
Sun, Hanlin
Sheng, Guojun
Zou, Xiaofu
Cheng, Ying
Tao, Fei
EXPERT SYSTEMS WITH APPLICATIONS, 2025, 277
