Towards Large Scale Ad-hoc Teamwork

Citations: 0
Authors
Yourdshahi, Elnaz Shafipour [1]
Pinder, Thomas [1]
Dhawan, Gauri [2]
Marcolino, Leandro Soriano [1]
Angelov, Plamen [1]
Affiliations
[1] Univ Lancaster, Sch Comp & Commun, Lancaster, England
[2] Vellore Inst Technol, Comp Sci Dept, Madras, Tamil Nadu, India
Source
2018 IEEE INTERNATIONAL CONFERENCE ON AGENTS (ICA) | 2018
Keywords
Collaborative Intelligence; Learning (Artificial Intelligence); Algorithms
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In complex environments, agents must be able to cooperate with previously unknown team-mates, and hence dynamically learn about other agents in the environment while searching for optimal actions. Previous works employ Monte Carlo Tree Search approaches. However, the search tree increases exponentially with the number of agents, and only scenarios with very small team sizes have been explored. Hence, in this paper we propose a history-based version of UCT Monte Carlo Tree Search, using a more compact representation than the original algorithm. We perform several experiments with a varying number of agents in the level-based foraging domain, an important testbed for ad-hoc teamwork. We achieve better overall performance than the state-of-the-art and better scalability with team size. Additionally, we contribute an open-source version of our system, making it easier for the research community to use the level-based foraging domain as a benchmark problem for ad-hoc teamwork.
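The scalability problem the abstract describes can be illustrated with a small sketch. This is not the paper's history-based algorithm; it shows only the standard UCB1 selection rule that UCT is built on, plus the arithmetic behind the exponential growth of joint actions. The function names and the figure of five actions per agent are assumptions made for illustration.

```python
import math


def joint_branching(actions_per_agent: int, n_agents: int) -> int:
    """Joint actions per tree node if every agent's choice is enumerated."""
    return actions_per_agent ** n_agents


def ucb1_select(counts, total_rewards, c=math.sqrt(2)):
    """Pick an action index by UCB1: mean reward plus an exploration bonus.

    counts[a] is how often action a was tried, total_rewards[a] the summed
    reward it returned. Untried actions are selected first.
    """
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    scores = [
        total_rewards[a] / counts[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


if __name__ == "__main__":
    # Assumed: 5 primitive actions per agent (hypothetical figure).
    for n in (1, 2, 4, 8):
        print(n, "agents ->", joint_branching(5, n), "joint actions")
```

With these assumed numbers, branching per node rises from 5 for one agent to 390625 for eight, which is why a more compact tree representation matters as team size grows.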
Pages: 44-49
Page count: 6