Emergent bluffing and inference with Monte Carlo Tree Search

被引：0

作者：

Cowling, Peter I. ^{[1
]}

Whitehouse, Daniel ^{[2
]}

Powley, Edward J. ^{[3
]}

机构：

[1] Univ York, Dept Comp Sci, York Ctr Complex Syst Anal, York YO10 5DD, N Yorkshire, England

[2] Hebe Works, Leeds, W Yorkshire, England

[3] Falmouth Univ, MetaMakers Inst, Acad Innovat & Res, Penryn, Cornwall, England

来源：

2015 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND GAMES (CIG) | 2015年

基金：

英国工程与自然科学研究理事会;

关键词：

INFORMATION; GAMES;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

In many card and board games, players cannot see the whole game state, with different players seeing different parts of the state. In such games, gathering of information (inference) is a key strategic aspect, to which information hiding (bluffing, among other techniques) is an important countermeasure. Monte Carlo Tree Search (MCTS) is a powerful general-purpose technique for decision making in games. MCTS rose to prominence through successes in combinatorial board games such as Go, but more recently has demonstrated promise in card, board and video games of incomplete information. MCTS can construct robust plans in stochastic environments (making it strong in some games), but in its vanilla form is unable to infer or bluff (making it weak in games where this is a central feature). In this paper, we augment MCTS with mechanisms for performing inference and bluffing. Like all algorithms based on game tree search, MCTS implicitly constructs a model of the opponents' decision processes. We show that this model can be repurposed to perform an approximation of Bayesian inference. We also obtain bluffing behaviour by self-determinization (introducing "impossible" worlds into the agent's pool of sampled states). We test our algorithms on The Resistance, a popular card game based around hidden roles.

引用

页码：114 / 121

页数：8

共 25 条

[1] [Anonymous], 2006, Modification of UCT with patterns in Monte-Carlo go
[2] [Anonymous], 2009, P INT C AUTOMATED PL, DOI [10.5555/3037223.3037228, DOI 10.5555/3037223.3037228]
[3] Finite-time analysis of the multiarmed bandit problem
Auer, P
Cesa-Bianchi, N
Fischer, P
[J]. MACHINE LEARNING, 2002, 47 (2-3) : 235 - 256
[4] Borsboom Joris., 2007, Proc. BeNeLux Conf. Artif. Intell, P57
[5] A Survey of Monte Carlo Tree Search Methods
Browne, Cameron B.
Powley, Edward
Whitehouse, Daniel
Lucas, Simon M.
Cowling, Peter I.
Rohlfshagen, Philipp
Tavener, Stephen
Perez, Diego
Samothrakis, Spyridon
Colton, Simon
[J]. IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (01) : 1 - 43
[6] Buro M, 2009, 21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, P1407
[7] Monte Carlo tree search in Kriegspiel
Ciancarini, Paolo
Favini, Gian Piero
[J]. ARTIFICIAL INTELLIGENCE, 2010, 174 (11) : 670 - 684
[8] Cowling P. I., 2014, IEEE C EV COMP
[9] Information Set Monte Carlo Tree Search
Cowling, Peter I.
Powley, Edward J.
Whitehouse, Daniel
[J]. IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (02) : 120 - 143
[10] Search in games with incomplete information: a case study using Bridge card play
Frank, I
Basin, D
[J]. ARTIFICIAL INTELLIGENCE, 1998, 100 (1-2) : 87 - 123

← 1 2 3 →