Deep Reinforcement Learning Policy in Hex Game System

Cited by: 0

Authors:

Lu, Mengxuan [1]

Li, Xuejun [1]

Affiliations:

[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China

Source:

PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC) | 2018

Keywords:

Computer Game; Hex Game; Deep Reinforcement Learning; Actor-Critic A3C; GO;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Hex is a zero-sum board game with a large solution space on an 11 x 11 board. In recent years, Go systems based on deep reinforcement learning, such as AlphaGo and AlphaGo Zero, have achieved great success. In this paper, we design a self-learning method and system structure for Hex. We design a policy network and a value network modeled on the residual network, and train both networks with the asynchronous advantage actor-critic (A3C) algorithm. A comparison between the deep reinforcement learning-based policy network and a fixed strategy shows that self-learning performs better.
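As a rough illustration of the advantage estimate that actor-critic methods such as A3C rely on, the sketch below computes discounted n-step returns and subtracts the critic's value predictions. It is not taken from the paper: the function names, the discount factor, and the reward scheme (0 for every move, 1 for the winning move) are all assumptions made for this example.

```python
from typing import List


def n_step_returns(rewards: List[float], bootstrap_value: float,
                   gamma: float = 0.99) -> List[float]:
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    bootstrap_value is the critic's estimate for the state after the
    last reward; use 0.0 when the rollout ended in a terminal state.
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk backwards so each step reuses the return of the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards: List[float], values: List[float],
               bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Advantage A_t = R_t - V(s_t), with V(s_t) given by the value network."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


# Hypothetical three-move rollout that ends with a win (reward 1).
example_returns = n_step_returns([0.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5)
example_advantages = advantages([0.0, 0.0, 1.0], [0.25, 0.25, 0.5],
                                bootstrap_value=0.0, gamma=0.5)
```

In an actor-critic update, the advantage typically weights the policy-gradient term for the policy network, while the return serves as the regression target for the value network.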

Pages: 6623-6626

Number of pages: 4
