Reinforcement learning soccer teams with incomplete world models

Cited by: 19
Authors
Wiering, M [1]
Salustowicz, R [1]
Schmidhuber, J [1]
Affiliation
[1] IDSIA, CH-6900 Lugano, Switzerland
Keywords
reinforcement learning; CMAC; world models; simulated soccer; Q(lambda); evolutionary computation; PIPE
DOI
10.1023/A:1008921914343
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may profit significantly from world models (WMs) estimating state transition probabilities and rewards. In high-dimensional, continuous input spaces, however, learning accurate WMs is intractable. Here we show that incomplete WMs can help to quickly find good action selection policies. Our approach is based on a novel combination of CMACs and prioritized sweeping-like algorithms. Variants thereof outperform both Q(lambda)-learning with CMACs and the evolutionary method Probabilistic Incremental Program Evolution (PIPE), which performed best in previous comparisons.
Pages: 77-88
Page count: 12