Cooperative strategy learning in multi-agent environment with continuous state space

Cited by: 0
Authors
Tao, Jun-Yuan [1]
Li, De-Sheng [2]
Affiliations
[1] Harbin Inst Technol, Dept Automat Measurement & Control, Harbin 150006, Peoples R China
[2] Beijing Univ Technol, Dept Mech & Elect Engn, Beijing, Peoples R China
Source
PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2006
Keywords
reinforcement learning; multi-agent; cooperative behavior; continuous state space
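DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract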
Reinforcement learning is a powerful method for solving sequential decision-making problems, but it is difficult to apply to practical problems such as multi-agent systems with continuous state spaces. In this paper we present a cooperative strategy learning method that addresses this difficulty. It combines the WoLF-PHC algorithm with function approximation techniques from reinforcement learning. With this method, an agent can learn cooperative behavior in a multi-agent environment with a continuous state space. Using Keepaway, a subtask of RoboCup soccer, we demonstrate the effectiveness of the learning method; the experimental results show that the algorithm converges.
Pages: 2107 / +
Number of pages: 2