Modular Q-learning based multi-agent cooperation for robot soccer

Cited by: 77

Authors:

Park, KH [1]

Kim, YJ [1]

Kim, JH [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Yusong Gu, Taejon 305701, South Korea

Source:

ROBOTICS AND AUTONOMOUS SYSTEMS | 2001 / Vol. 35 / No. 02

Keywords:

multi-agent system; robot soccer system; reinforcement learning; modular Q-learning; action selection;

DOI:

10.1016/S0921-8890(01)00114-2

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In a multi-agent system, action selection is important for the cooperation and coordination among agents. As the environment is dynamic and complex, modular Q-learning, which is one of the reinforcement learning schemes, is employed in assigning a proper action to an agent in the multi-agent system. The architecture of modular Q-learning consists of learning modules and a mediator module. The mediator module of the modular Q-learning system selects a proper action for the agent based on the Q-value obtained from each learning module. To obtain better performance, along with the Q-value, the mediator module also considers the state information in the action selection process. A uni-vector field is used for robot navigation. In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using five micro-robots. (C) 2001 Elsevier Science B.V. All rights reserved.
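The architecture described in the abstract (several learning modules feeding one mediator module) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the class names `LearningModule` and `Mediator`, the tabular Q-update, the epsilon-greedy exploration, and the rule of summing Q-values across modules (sometimes called a "greatest mass" merge) are not taken from the paper. The abstract states that the paper's mediator also uses state information when selecting an action; since it does not say how, that part is omitted here.

```python
import random
from collections import defaultdict


class LearningModule:
    """Tabular Q-learner for one sub-task (hypothetical helper, not from the paper)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        # Q-table: state -> list of Q-values, one per action, initialised to 0.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def q_values(self, state):
        """Return this module's Q-values for its own view of the state."""
        return self.q[state]

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update for the executed action."""
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


class Mediator:
    """Picks one action from the Q-values reported by all modules.

    Assumption: Q-values are summed per action and the largest total wins.
    """

    def __init__(self, modules, n_actions, epsilon=0.1, seed=0):
        self.modules = modules
        self.n_actions = n_actions
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def select_action(self, module_states):
        """module_states[i] is the state as observed by modules[i]."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)  # explore
        totals = [
            sum(m.q_values(s)[a] for m, s in zip(self.modules, module_states))
            for a in range(self.n_actions)
        ]
        return max(range(self.n_actions), key=lambda a: totals[a])  # exploit
```

In such a setup, each module would typically be updated with the action the mediator actually chose, so that all modules learn from the same executed behaviour.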


Pages: 109-122

Number of pages: 14

