Modular Q-learning based multi-agent cooperation for robot soccer

Cited by: 77
Authors
Park, KH [1]
Kim, YJ [1]
Kim, JH [1]
Affiliation
[1] Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Yusong Gu, Taejon 305701, South Korea
Keywords
multi-agent system; robot soccer system; reinforcement learning; modular Q-learning; action selection
DOI
10.1016/S0921-8890(01)00114-2
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In a multi-agent system, action selection is important for the cooperation and coordination among agents. As the environment is dynamic and complex, modular Q-learning, which is one of the reinforcement learning schemes, is employed in assigning a proper action to an agent in the multi-agent system. The architecture of modular Q-learning consists of learning modules and a mediator module. The mediator module of the modular Q-learning system selects a proper action for the agent based on the Q-value obtained from each learning module. To obtain better performance, along with the Q-value, the mediator module also considers the state information in the action selection process. A uni-vector field is used for robot navigation. In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using five micro-robots. (C) 2001 Elsevier Science B.V. All rights reserved.
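The architecture described in the abstract (several learning modules feeding one mediator) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the class names `LearningModule` and `Mediator`, the tabular Q-learning update, the learning-rate and discount values, and in particular the mediator rule, which simply sums each action's Q-values across modules (a "greatest mass" style merge). The paper's mediator also uses state information, which this sketch omits because the record does not say how.

```python
import random
from collections import defaultdict


class LearningModule:
    """Hypothetical tabular Q-learner that sees only its own slice of the state."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def q_values(self, state):
        return self.q[state]

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


class Mediator:
    """Hypothetical mediator: picks the action with the largest summed Q-value."""

    def __init__(self, modules, epsilon=0.1, rng=None):
        self.modules = modules
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = rng or random.Random(0)

    def select_action(self, module_states):
        n_actions = self.modules[0].n_actions
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(n_actions)  # explore
        totals = [0.0] * n_actions
        for module, state in zip(self.modules, module_states):
            for a, q in enumerate(module.q_values(state)):
                totals[a] += q
        return max(range(n_actions), key=totals.__getitem__)


# Usage: two modules, three actions, no exploration.
modules = [LearningModule(n_actions=3) for _ in range(2)]
mediator = Mediator(modules, epsilon=0.0)
modules[0].update("s0", 2, 1.0, "s1")  # module 0 learns to prefer action 2
modules[1].update("t0", 1, 0.5, "t1")  # module 1 mildly prefers action 1
action = mediator.select_action(["s0", "t0"])
```

Summing Q-values is only one possible merge rule; it is used here because it is short and lets a strongly opinionated module outvote a weakly opinionated one.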
Pages: 109-122
Page count: 14