Research on improving Mahjong model based on deep reinforcement learning

Citations: 0
Authors
Wang, Yajie [1]
Wei, Zhihao [2]
Han, Shengyu [2]
Shi, Zhonghui [2]
Affiliations
[1] Shenyang Aerospace University, Engineering Training Center, Shenyang 110000, Liaoning, People's Republic of China
[2] Shenyang Aerospace University, Shenyang 110000, Liaoning, People's Republic of China
Keywords
incomplete information game; Chinese public Mahjong; deep learning; reinforcement learning; GAME
DOI
10.1504/IJCSM.2024.136829
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Mahjong is a popular incomplete-information game that has attracted many researchers. To improve the playing strength of existing Mahjong models, a method based on deep learning and reinforcement learning is proposed. First, a Mahjong program (MPRE) is designed. MPRE serves both as the generator of training data for deep learning and as the baseline against which MPRE_RL is compared. Second, using the feature-extraction capability of deep learning, the playing ability of MPRE is transferred into a deep learning model. Third, the deep learning model is continuously improved by reinforcement learning. To improve the training speed and stability of reinforcement learning, several modifications are made to the environment and the rewards. Finally, the results show that MPRE_RL, the model improved with the proposed method, outperforms MPRE in both offensive (27.1% winning rate) and defensive (19.5% win-by-discard rate) play.
Pages: 11