Multi-agent Behavior-Based Policy Transfer

Cited by: 15
Authors
Didi, Sabre [1]
Nitschke, Geoff [1]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, ZA-7700 Cape Town, South Africa
Source
APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2016, PT II | 2016 / Vol. 9598
Keywords
Multi-agent learning; Evolutionary algorithms; Transfer learning; Behavioural diversity adaptation; NETWORKS; SOCCER;
DOI
10.1007/978-3-319-31153-1_13
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A key objective of transfer learning is to improve and speed up learning on a target task after training on a different, but related, source task. This study presents a neuro-evolution method that transfers evolved policies within multi-agent tasks of varying degrees of complexity. The method incorporates behavioral diversity (novelty) search as a means to boost the task performance of transferred policies (multi-agent behaviors). Results indicate that transferred evolved multi-agent behaviors are significantly improved in more complex tasks when adapted using behavioral diversity. By comparison, transferred behaviors that are not further adapted using behavioral diversity perform relatively poorly in terms of adaptation times and quality of solutions in target tasks. Also, in support of previous work, both policy transfer methods (with and without behavioral diversity adaptation) outperform behaviors evolved in target tasks without transfer learning.
Pages: 181-197
Number of pages: 17