A multi-agent reinforcement learning using Actor-Critic methods

Cited by: 7

Authors:

Li, Chun-Gui [1]

Wang, Meng [1]

Yuan, Qing-Neng [1]

Affiliations:

[1] Guangxi Univ Technol, Dept Comp Engn, Liuzhou 545006, Peoples R China

Source:

PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2008

Keywords:

multi-agent; reinforcement learning; Actor-Critic methods; temporal best-response strategy; Nash equilibrium;

DOI:

10.1109/ICMLC.2008.4620528

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

This paper investigates a new algorithm for multi-agent reinforcement learning. We propose a multi-agent learning algorithm that extends single-agent Actor-Critic methods to the multi-agent setting. To realize the algorithm, we introduce the value of an agent's temporal best-response strategy in place of the value of an equilibrium, so the algorithm uses linear programming to compute Q-values. When a game has multiple Nash equilibria, a mixed equilibrium is reached. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.


Pages: 878-882

Page count: 5
