Deriving the Optimal Strategy for the Two Dice Pig Game via Reinforcement Learning

Cited by: 2
Authors
Zhu, Tian [1]
Ma, Merry H. [2]
Affiliations
[1] SUNY Stony Brook, Dept Appl Math & Stat, Stony Brook, NY 11794 USA
[2] Stony Brook Sch, 1 Chapman Pkwy, Stony Brook, NY 11790 USA
Source
STATS | 2022 / Vol. 5 / Issue 03
Keywords
dynamic programming; game theory; Markov decision process; optimization; two-dice pig game; value iteration; PROGRAM;
DOI
10.3390/stats5030047
Chinese Library Classification
O1 [Mathematics];
Discipline classification codes
0701 ; 070101 ;
Abstract
Games of chance have historically played a critical role in the development and teaching of probability theory and game theory and, in the modern age, of computer programming and reinforcement learning. In this paper, we derive the optimal strategy for playing the two-dice game Pig, in both the standard version and its variant with doubles, coined "Double-Trouble", using fundamental concepts of reinforcement learning, in particular the Markov decision process and dynamic programming. We further compare the newly derived optimal strategy with other popular play strategies in terms of winning chances and order of play. In particular, we compare it with the popular "hold at n" strategy, which is considered close to the optimal strategy, especially for the best n, for each type of Pig game. For the standard two-player, two-dice, sequential Pig game examined here, we found that "hold at 23" is the best choice, with an average winning chance against the optimal strategy of 0.4747. For the "Double-Trouble" version, we found that "hold at 18" is the best choice, with an average winning chance against the optimal strategy of 0.4733. The number of turns needed to play each type of game is also examined for practical purposes. For optimal vs. optimal or optimal vs. the best "hold at n" strategy, we found that the average number of turns is 19, 23, and 24 for one-die Pig, standard two-dice Pig, and "Double-Trouble" two-dice Pig, respectively. We hope our work will inspire students of all ages to invest in the field of reinforcement learning, which is crucial for the development of artificial intelligence and robotics and, subsequently, for the future of humanity.
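As a rough illustration of the "hold at n" strategy mentioned in the abstract, the sketch below simulates a single player using it in standard two-dice Pig. The rules coded here are assumptions, since the abstract does not state them: the goal is 100 points, a single 1 forfeits the turn total, and double 1s reset the whole score. The names `GOAL`, `play_turn`, and `turns_to_finish` are hypothetical. This is not the authors' value-iteration method, and a solo simulation is not expected to reproduce the two-player figures reported above.

```python
import random

GOAL = 100  # assumed target score; not stated in the abstract


def play_turn(score, hold_at, rng):
    """Play one turn of (assumed) standard two-dice Pig with 'hold at n'.

    Returns the player's score after the turn.
    """
    turn_total = 0
    # Keep rolling until the turn total reaches n or the goal is reached.
    while turn_total < hold_at and score + turn_total < GOAL:
        d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
        if d1 == 1 and d2 == 1:
            return 0        # assumed rule: double 1s wipe the entire score
        if d1 == 1 or d2 == 1:
            return score    # assumed rule: a single 1 forfeits the turn total
        turn_total += d1 + d2
    return score + turn_total


def turns_to_finish(hold_at, rng):
    """Count the turns a lone 'hold at n' player needs to reach GOAL."""
    score, turns = 0, 0
    while score < GOAL:
        score = play_turn(score, hold_at, rng)
        turns += 1
    return turns


if __name__ == "__main__":
    rng = random.Random(2022)  # fixed seed for repeatability
    samples = [turns_to_finish(23, rng) for _ in range(10_000)]
    print("mean solo turns with hold at 23:", sum(samples) / len(samples))
```

Passing a `random.Random` instance keeps runs repeatable and makes it easy to compare different values of n on the same footing.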
Pages: 805-818
Page count: 14