Deep Reinforcement Learning Policy in Hex Game System

Authors
Lu, Mengxuan [1]
Li, Xuejun [1]
Affiliations
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
Source
PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC) | 2018
Keywords
Computer Game; Hex Game; Deep Reinforcement Learning; Actor-Critic A3C; GO
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Hex is a zero-sum board game with a large solution space on an 11 x 11 board. In recent years, deep reinforcement learning-based Go systems, namely AlphaGo and AlphaGo Zero, have achieved remarkable results. In this paper, we design a self-learning method and system structure for Hex. We design a policy network and a value network based on the residual network, and train both with the asynchronous advantage actor-critic (A3C) algorithm. A comparison between the deep reinforcement learning-based policy network and a fixed strategy shows that self-learning performs better.
Pages: 6623 - 6626
Number of pages: 4