Modeling and Algorithms of Multi-agent Reinforcement Learning Using Stochastic Game

Cited by: 0

Authors:

Xie Guangqiang [1,2]

Chen Xuesong [1,3]

Affiliations:

[1] Guangdong Univ Technol, Fac Automat, Guangzhou 510006, Guangdong, Peoples R China

[2] Guangdong Univ Technol, Fac Comp, Guangzhou 510006, Guangdong, Peoples R China

[3] Guangdong Univ Technol, Fac Appl Math, Guangzhou 510006, Guangdong, Peoples R China

Source:

2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL VII | 2010

Keywords:

reinforcement learning; multi-agent systems; convergence; stochastic games; SYSTEMS;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning has been successful at finding optimal control policies through trial-and-error interaction with a dynamic environment. Its self-improving and online-learning properties have made it one of the most important machine learning methods. We first survey the foundations of reinforcement learning and the mathematical models of its environment, and then discuss the convergence and reward of the algorithms. Next, two traditional algorithms, Q-Learning (QL) and Minimax Q-Learning (MQL), are discussed in detail, and a new algorithm, Opponent Modeling Q-Learning (OMQL), is formed based on MQL. Finally, the simulation results demonstrate that the model converges steadily to the optimal strategy, improves learning performance, and speeds up the convergence of the learning process.


Pages: 375-378

Page count: 4

9 entries in total

[1] [Anonymous], 2000, Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence

[2] Arai S., 2000, Proceedings of the Fourth International Conference on Autonomous Agents, P104, DOI 10.1145/336595.337062

[3] Bowling M., 2003, MULTIAGENT LEARNING

[4] A comprehensive survey of multiagent reinforcement learning [J].