WagerWin: An Efficient Reinforcement Learning Framework for Gambling Games

Cited by: 1
Authors
Wang, Haoli [1 ]
Wu, Hejun [1 ]
Lai, Guoming [2 ]
Affiliations
[1] Sun Yat-sen University, Department of Computer Science and Engineering, Guangzhou 510275, People's Republic of China
[2] Huizhou University, School of Computer Science and Engineering, Huizhou 516007, Guangdong, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Games; Artificial intelligence; Training; Reinforcement learning; Training data; Monte Carlo methods; Law; Gambling games; game AI; reinforcement learning (RL); NETWORKS; POKER; GO;
DOI
10.1109/TG.2022.3226526
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Although reinforcement learning (RL) has achieved great success in diverse scenarios, complex gambling games still pose serious challenges for RL. Common deep RL methods have difficulty maintaining stability and efficiency in such games. Through theoretical analysis, we find that the return distribution of a gambling game is an intrinsic cause of this problem. This return distribution is partitioned into two parts, representing the gain and the loss, depending on the win/lose outcome. The two parts repel each other as the player keeps "raising," i.e., increasing the wager. Common deep RL methods, however, directly approximate the expectation of the return without considering this particular structure of the distribution, which introduces a redundant loss term into the objective function and consequently high variance. In this work, we propose WagerWin, a new RL framework for gambling games. WagerWin introduces probability and value factorization to construct a more effective value function, removing the redundant loss term from the training objective. In addition, WagerWin supports customized policy adaptation, which can tune a pretrained policy toward different inclinations. We conduct extensive experiments on DouDizhu and SmallDou, a reduced version of DouDizhu. The results demonstrate that WagerWin outperforms the original state-of-the-art RL model in both training efficiency and stability.
Pages: 483-491 (9 pages)
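
The abstract above describes factorizing the value function into a win probability and separate gain and loss components, so that the two repelling parts of the return distribution are modeled individually rather than collapsed into a single expectation. This record does not include the paper's architecture or objective, so the following PyTorch snippet is only a minimal sketch of what such a probability-and-value factorization could look like; the class name FactorizedValue, the three heads, and the combination p_win * v_gain - (1 - p_win) * v_loss are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedValue(nn.Module):
    # Hypothetical value head: every name and the exact functional form here are
    # illustrative assumptions based on the abstract, not the paper's code.
    def __init__(self, feature_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feature_dim, hidden_dim), nn.ReLU())
        self.win_prob_head = nn.Linear(hidden_dim, 1)  # p_win(s): chance the player wins
        self.gain_head = nn.Linear(hidden_dim, 1)      # magnitude of the gain if the player wins
        self.loss_head = nn.Linear(hidden_dim, 1)      # magnitude of the loss if the player loses

    def forward(self, state_features: torch.Tensor) -> torch.Tensor:
        h = self.encoder(state_features)
        p_win = torch.sigmoid(self.win_prob_head(h))   # win probability in (0, 1)
        v_gain = F.softplus(self.gain_head(h))         # nonnegative gain magnitude
        v_loss = F.softplus(self.loss_head(h))         # nonnegative loss magnitude
        # Expected return assembled from the two parts of the return distribution:
        # the gain weighted by the win probability, the loss by the lose probability.
        return p_win * v_gain - (1.0 - p_win) * v_loss

if __name__ == "__main__":
    value_fn = FactorizedValue(feature_dim=64)
    states = torch.randn(8, 64)    # a batch of 8 encoded game states
    print(value_fn(states).shape)  # torch.Size([8, 1])

Under this reading, the win probability and the two conditional magnitudes could be trained as separate targets, which is one plausible way the redundant loss term mentioned in the abstract would disappear from the objective; keeping the components separate would also make it natural to retune a pretrained policy toward different inclinations, which may relate to the customized policy adaptation the abstract mentions.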