Learning sequences of compatible actions among agents

Cited by: 4
Authors
Polat, F [1 ]
Abul, O [1 ]
Affiliation
[1] Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey
Keywords
bucket brigade learning; multiagent learning; multiagent systems; Q-learning; reinforcement learning;
DOI
10.1023/A:1015009422110
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Action coordination in multiagent systems is a difficult task, especially in dynamic environments. The task becomes even harder when the environment imposes constraints of cooperation, minimal communication, action incompatibility, and local information. This work studies the learning of compatible action sequences to achieve a designated goal under these constraints. Two new multiagent learning algorithms, QACE and NoCommQACE, are developed. To improve their performance, four heuristics are introduced: state iteration, means-ends analysis, decreasing reward, and do-nothing. The proposed algorithms are tested on the blocks-world domain, and the performance results are reported.
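The abstract does not specify the QACE/NoCommQACE update rules, but both are keyword-listed as Q-learning-based. As background only, a minimal single-agent tabular Q-learning sketch on a toy corridor task (all names and the environment are illustrative assumptions, not the paper's method) might look like:

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch on a 1-D corridor task.
# This is NOT the paper's QACE/NoCommQACE algorithm; it only illustrates
# the Q-learning update that such multiagent learners build on.

GOAL = 4           # rightmost cell of a 5-cell corridor
ACTIONS = (-1, 1)  # move left or right

def step(state, action):
    """Toy environment: reward 1.0 on reaching the goal cell, else 0."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated return
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# Greedy policy extracted from the learned Q-table.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The multiagent setting studied in the paper additionally has to handle which joint actions are compatible and what the agents may communicate; the single-agent update above is only the shared starting point.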
Pages: 21-37
Page count: 17