A multi-agent reinforcement learning using Actor-Critic methods

Cited by: 7
Authors
Li, Chun-Gui [1]
Wang, Meng [1]
Yuan, Qing-Neng [1]
Affiliations
[1] Guangxi Univ Technol, Dept Comp Engn, Liuzhou 545006, Peoples R China
Source
PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2008
Keywords
multi-agent; reinforcement learning; Actor-Critic methods; temporal best-response strategy; Nash equilibrium;
DOI
10.1109/ICMLC.2008.4620528
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper investigates a new algorithm for multi-agent reinforcement learning. We propose a multi-agent learning algorithm that extends single-agent Actor-Critic methods to the multi-agent setting. To realize the algorithm, we introduce the value of an agent's temporal best-response strategy in place of the value of an equilibrium. Accordingly, our algorithm uses linear programming to compute Q values. When a game has multiple Nash equilibria, the mixed equilibrium is reached. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.
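For orientation, the single-agent Actor-Critic update that the abstract takes as its starting point can be sketched as follows. This is a minimal tabular sketch of the textbook method, not the paper's multi-agent algorithm: the function names `softmax` and `actor_critic_step`, the step sizes `alpha` and `beta`, the discount `gamma`, and the list-based tables are all assumptions made for illustration. The best-response and linear-programming steps mentioned in the abstract are not shown.

```python
import math


def softmax(prefs):
    """Turn a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, s, a, r, s_next,
                      alpha=0.1, beta=0.1, gamma=0.9):
    """One tabular actor-critic update (single agent, illustrative only).

    V      : list of state values (the critic), updated in place
    prefs  : list of per-state action-preference lists (the actor),
             updated in place
    s, a   : state visited and action taken
    r      : reward received
    s_next : successor state
    Returns the TD error used for both updates.
    """
    # Critic: temporal-difference error of the observed transition.
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    # Actor: raise the preference for the action if the outcome was
    # better than expected, lower it otherwise.
    prefs[s][a] += beta * delta
    return delta


if __name__ == "__main__":
    V = [0.0, 0.0]
    prefs = [[0.0, 0.0], [0.0, 0.0]]
    delta = actor_critic_step(V, prefs, s=0, a=1, r=1.0, s_next=1)
    print(delta, V[0], softmax(prefs[0]))
```

After a single rewarded transition, the value of the visited state and the probability of the chosen action both increase. A multi-agent variant would additionally have to account for the other agents' strategies, which this sketch does not attempt.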
Pages: 878-882
Number of pages: 5
Related papers
13 in total
[1]   Claus C, 1998, FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, P746
[2]   Greenwald A, 2003, P 20 INT C MACH LEAR, V3, P242
[3]   Hu J., 2003, J MACHINE LEARNING R, V4, P1039, DOI 10.5555/945365.964288
[4]   KAPETANAKIS S, 2004, P 3 INT JOINT C AUT, P1258
[5]   Littman M. L., 2001, ICML, P322, DOI 10.5555/645530.655661
[6]   LITTMAN ML, 1994, P 11 INT C MACH LEAR, P157
[7]   LITTMAN ML, 1996, P 13 INT C MACH LEAR, P310
[8]   If multi-agent learning is the answer, what is the question? [J].
Shoham, Yoav ;
Powers, Rob ;
Grenager, Trond .
ARTIFICIAL INTELLIGENCE, 2007, 171 (07) :365-377
[9]   Multiagent systems: A survey from a machine learning perspective [J].
Stone, P ;
Veloso, M .
AUTONOMOUS ROBOTS, 2000, 8 (03) :345-383
[10]   Suematsu N., 2002, Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, P370