WagerWin: An Efficient Reinforcement Learning Framework for Gambling Games

Cited by: 1
Authors
Wang, Haoli [1 ]
Wu, Hejun [1 ]
Lai, Guoming [2 ]
Affiliations
[1] Sun Yat Sen Univ, Dept Comp Sci & Engn, Guangzhou 510275, Peoples R China
[2] Huizhou Univ, Sch Comp Sci & Engn, Huizhou 516007, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Games; Artificial intelligence; Training; Reinforcement learning; Training data; Monte Carlo methods; Law; Gambling games; game AI; reinforcement learning (RL); NETWORKS; POKER; GO;
DOI
10.1109/TG.2022.3226526
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although reinforcement learning (RL) has achieved great success in diverse scenarios, complex gambling games still pose significant challenges. Common deep RL methods have difficulty maintaining stability and efficiency in such games. Through theoretical analysis, we find that the return distribution of a gambling game is an intrinsic cause of this problem. The return distribution is partitioned into two parts according to the win/lose outcome, representing the gain and the loss, respectively. These two parts are pushed apart as the player keeps "raising," i.e., increasing the wager. However, common deep RL methods directly approximate the expectation of the return without accounting for this structure of the distribution, which introduces a redundant loss term into the objective function and, in turn, high variance. In this work, we propose WagerWin, a new RL framework for gambling games. WagerWin introduces probability and value factorization to construct a more effective value function, and it removes the redundant loss term from the training objective. In addition, WagerWin supports customized policy adaptation, which can tune a pretrained policy toward different inclinations. We conduct extensive experiments on DouDizhu and SmallDou, a reduced version of DouDizhu. The results demonstrate that WagerWin outperforms the original state-of-the-art RL model in both training efficiency and stability.
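The "probability and value factorization" mentioned in the abstract can be pictured as splitting the value estimate into a win probability and two conditional magnitudes, each trained against its own target. The sketch below is only an illustrative reading of that idea under assumed names (FactorizedValueHead, p_win, v_win, v_lose) and an assumed combined loss; it is not the authors' implementation.

```python
# Minimal sketch (assumptions, not the paper's code): a factorized value head
# V(s) = p_win(s) * v_win(s) - (1 - p_win(s)) * v_lose(s), where each factor
# is supervised by its own target instead of a single return regression.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedValueHead(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        self.p_win = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())    # P(win | s)
        self.v_win = nn.Sequential(nn.Linear(feat_dim, 1), nn.Softplus())   # expected gain given a win
        self.v_lose = nn.Sequential(nn.Linear(feat_dim, 1), nn.Softplus())  # expected loss given a loss

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        p = self.p_win(feats)
        return p * self.v_win(feats) - (1.0 - p) * self.v_lose(feats)


def factorized_loss(head: FactorizedValueHead,
                    feats: torch.Tensor,     # (N, feat_dim) state features
                    won: torch.Tensor,       # (N, 1) 1.0 if the game was won, else 0.0
                    gain: torch.Tensor,      # (N, 1) realized gain on won games
                    loss_mag: torch.Tensor   # (N, 1) realized loss magnitude on lost games
                    ) -> torch.Tensor:
    # Each factor gets its own target, so the win and lose parts of the return
    # no longer pull a single scalar value estimate in opposite directions.
    p = head.p_win(feats)
    outcome_loss = F.binary_cross_entropy(p, won)
    gain_loss = (won * (head.v_win(feats) - gain) ** 2).mean()
    lose_loss = ((1.0 - won) * (head.v_lose(feats) - loss_mag) ** 2).mean()
    return outcome_loss + gain_loss + lose_loss
```

Under this reading, the gain and loss components of the return each supervise a separate head, which is one plausible way the redundant loss term and its variance could be avoided; the actual factorization and training objective are detailed in the paper itself.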
Pages: 483-491
Number of pages: 9
Related Papers
50 records in total
  • [11] A Deep Reinforcement Learning-Based Framework for PolSAR Imagery Classification
    Nie, Wen
    Huang, Kui
    Yang, Jie
    Li, Pingxiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [12] Towards sample efficient deep reinforcement learning in collectible card games
    Vieira, Ronaldo e Silva
    Tavares, Anderson Rocha
    Chaimowicz, Luiz
    ENTERTAINMENT COMPUTING, 2023, 47
  • [13] Parallelization of Reinforcement Learning Algorithms for Video Games
    Kopel, Marek
    Szczurek, Witold
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2021, 2021, 12672 : 195 - 207
  • [14] Distilling Reinforcement Learning Tricks for Video Games
    Kanervisto, Anssi
    Scheller, Christian
    Schraner, Yanick
    Hautamaki, Ville
    2021 IEEE CONFERENCE ON GAMES (COG), 2021, : 1088 - 1091
  • [15] Reinforcement learning in population games
    Lahkar, Ratul
    Seymour, Robert M.
    GAMES AND ECONOMIC BEHAVIOR, 2013, 80 : 10 - 38
  • [16] An Efficient Reinforcement Learning Based Framework for Exploring Logic Synthesis
    Qian, Yu
    Zhou, Xuegong
    Zhou, Hao
    Wang, Lingli
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2024, 29 (02)
  • [17] Efficient Parallel Reinforcement Learning Framework Using the Reactor Model
    Kwok, Jacky
    Lohstroh, Marten
    Lee, Edward A.
    PROCEEDINGS OF THE 36TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, SPAA 2024, 2024, : 41 - 51
  • [18] An Efficient Framework for Personalizing EMG-Driven Musculoskeletal Models Based on Reinforcement Learning
    Berman, Joseph
    Lee, I-Chieh
    Yin, Jie
    Huang, He
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2024, 32 : 4174 - 4185
  • [19] Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games
    Mao, Weichao
    Basar, Tamer
    DYNAMIC GAMES AND APPLICATIONS, 2023, 13 (01) : 165 - 186
  • [20] Optimal Group Consensus of Multiagent Systems in Graphical Games Using Reinforcement Learning
    Wang, Yuhan
    Wang, Zhuping
    Zhang, Hao
    Yan, Huaicheng
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2025, 55 (03) : 2343 - 2353