Shaping multi-agent systems with gradient reinforcement learning

Cited by: 25
Authors
Buffet, Olivier
Dutech, Alain
Charpillet, Francois
Affiliations
[1] CNRS, LAAS, Grp RIS, F-31077 Toulouse 4, France
[2] Loria INRIA Lorraine, F-54506 Vandoeuvre Les Nancy, France
Keywords
reinforcement learning; multi-agent systems; partially observable Markov decision processes; shaping; policy-gradient
DOI
10.1007/s10458-006-9010-5
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
An original reinforcement learning (RL) methodology is proposed for the design of multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordinated system is of crucial importance. To that end, we design simple reactive agents in a decentralized way as independent learners. But to cope with the difficulties inherent to RL used in that framework, we have developed an incremental learning algorithm where agents face a sequence of progressively more complex tasks. We illustrate this general framework by computer experiments where agents have to coordinate to reach a global goal.
Pages: 197-220
Page count: 24