Scalable sub-game solving for imperfect-information games

Cited by: 1

Authors:

Li, Huale [1]

Wang, Xuan [1,2]

Li, Kunchi [1]

Jia, Fengwei [1]

Wu, Yulin [1]

Zhang, Jiajia [1]

Qi, Shuhan [1,2]

Affiliations:

[1] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China

[2] Peng Cheng Lab, Shenzhen 518038, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2021 / Volume 231

Funding:

National Natural Science Foundation of China;

Keywords:

Game; Counterfactual regret minimization; Imperfect-information; Agent; GO;

DOI:

10.1016/j.knosys.2021.107434

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Counterfactual regret minimization (CFR) is a popular and effective method for solving games with imperfect information. The effectiveness of CFR is limited by the size of the game state space, which grows rapidly as the number of players increases. Although vanilla CFR is suitable for two-player imperfect-information games, it does not work well in imperfect-information games with three or more players. In this paper, we design a framework for imperfect-information games that can handle two-player games and can also efficiently solve three-player games. Within this framework, we propose real-time hand abstraction (RTHA), which can reduce the error caused by abstraction compared with traditional solving methods. We also propose a warm-start online solution of sub-game (WSOS-SG) method, which can improve the accuracy of action estimation and solve the sub-game in real time. Experimental results show that the agent based on our method achieves better performance than traditional methods. The agent took part in the 2018 AAAI-ACPC poker competition and won third place in heads-up no-limit Texas hold'em. (C) 2021 Elsevier B.V. All rights reserved.

Pages: 11
