Deriving the Optimal Strategy for the Two Dice Pig Game via Reinforcement Learning

Cited by: 2

Authors:

Zhu, Tian [1]

Ma, Merry H. [2]

Affiliations:

[1] SUNY Stony Brook, Dept Appl Math & Stat, Stony Brook, NY 11794 USA

[2] Stony Brook Sch, 1 Chapman Pkwy, Stony Brook, NY 11790 USA

Source:

STATS | 2022 / Vol. 5 / Issue 3

Keywords:

dynamic programming; game theory; Markov decision process; optimization; two-dice pig game; value iteration; PROGRAM;

DOI:

10.3390/stats5030047

Chinese Library Classification number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

Games of chance have historically played a critical role in the development and teaching of probability theory and game theory and, in the modern age, of computer programming and reinforcement learning. In this paper, we derive the optimal strategy for playing the two-dice game Pig, both the standard version and its variant with doubles, coined "Double-Trouble", using fundamental concepts of reinforcement learning, especially the Markov decision process and dynamic programming. We further compare the newly derived optimal strategy with other popular play strategies in terms of winning chances and order of play. In particular, we compare it with the popular "hold at n" strategy, which, for the best n, is considered close to the optimal strategy for each type of Pig game. For the standard two-player, two-dice, sequential Pig game examined here, we found that "hold at 23" is the best choice, with an average winning chance against the optimal strategy of 0.4747. For the "Double-Trouble" version, we found that "hold at 18" is the best choice, with an average winning chance against the optimal strategy of 0.4733. Furthermore, the length of each type of game, measured in turns, is also examined for practical purposes. For optimal vs. optimal or optimal vs. the best "hold at n" strategy, we found that the average number of turns is 19, 23, and 24 for the one-die Pig, standard two-dice Pig, and "Double-Trouble" two-dice Pig games, respectively. We hope our work will inspire students of all ages to invest in the field of reinforcement learning, which is crucial for the development of artificial intelligence and robotics and, subsequently, for the future of humanity.
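To make the "hold at n" strategy concrete, the following Python sketch computes the exact outcome distribution of a single turn played with that strategy. The abstract does not state the rules, so the sketch assumes the commonly described standard two-dice Pig rules: a roll showing exactly one 1 loses the turn total, a roll of two 1s loses the player's entire score, and any other roll adds the sum of the dice to the turn total. The function name `hold_at_n_turn_distribution` and the outcome labels `"bust"` and `"wipe"` are illustrative, and the sketch ignores the end-of-game condition. It is not the paper's value-iteration method, only an illustration of the baseline strategy the paper compares against.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def hold_at_n_turn_distribution(n):
    """Exact outcome distribution of one 'hold at n' turn (assumed rules).

    Keys of the returned dict:
      int     -> turn total banked after holding (always >= n)
      "bust"  -> exactly one die showed 1; the turn total is lost
      "wipe"  -> both dice showed 1; the whole score is lost
    Values are exact probabilities as Fractions.
    """
    roll_prob = Fraction(1, 36)  # each ordered pair of faces is equally likely
    outcomes = defaultdict(Fraction)
    # Turn totals at which the player is still rolling, with their probabilities.
    pending = {0: Fraction(1)}
    while pending:
        next_pending = defaultdict(Fraction)
        for total, prob in pending.items():
            for a, b in product(range(1, 7), repeat=2):
                p = prob * roll_prob
                if a == 1 and b == 1:
                    outcomes["wipe"] += p
                elif a == 1 or b == 1:
                    outcomes["bust"] += p
                elif total + a + b >= n:
                    outcomes[total + a + b] += p  # threshold reached: hold
                else:
                    next_pending[total + a + b] += p  # keep rolling
        pending = next_pending
    return dict(outcomes)


if __name__ == "__main__":
    dist = hold_at_n_turn_distribution(23)
    banked = sum(p for k, p in dist.items() if isinstance(k, int))
    print("P(bank >= 23):", float(banked))
    print("P(bust):", float(dist["bust"]))
    print("P(wipe):", float(dist["wipe"]))
```

Exact fractions are used so that the probabilities sum to exactly 1, which makes the sketch easy to check. The loop terminates because every non-losing roll adds at least 4 to the turn total.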


Pages: 805-818

Page count: 14
