Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions

Cited by: 19
Authors
Gu, Bonwoo [1]
Sung, Yunsick [2]
Affiliations
[1] SIMNET Cooperat, Dept M&S, Daejeon 34127, South Korea
[2] Dongguk Univ Seoul, Dept Multimedia Engn, Seoul 04620, South Korea
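Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Issue 03
Funding
National Research Foundation, Singapore;
Keywords
gomoku; game artificial intelligence; convolutional neural-networks; one-hot encoding; reinforcement learning;
DOI
10.3390/app11031291
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract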
Gomoku is a two-player board game that originated in ancient China. Gomoku has been implemented with various artificial intelligence techniques, such as genetic algorithms and tree search algorithms. Alpha-Gomoku, a Gomoku AI built with Alpha-Go's algorithm, defines all possible situations on the Gomoku board using Monte-Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated Gomoku board situations. However, in tree search algorithms, accuracy drops because the classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based high-level decision approach using convolutional neural networks (CNNs). The proposed algorithm expresses each state as one-hot encoding-based vectors and determines the state of the Gomoku board by combining similar states of the one-hot encoding-based vectors. In addition, for cases where the stone position determined by the CNN is already occupied or cannot be played, we suggest a method for selecting an alternative. We verify the proposed Gomoku AI method in GuPyEngine, a Python-based 3D simulation platform.
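The abstract names two ideas that a small example can make concrete: one-hot encoding of the board state, and falling back to an alternative when the preferred cell is already occupied. The following is a minimal sketch under stated assumptions, not the authors' implementation. The cell codes (EMPTY, BLACK, WHITE), the function names one_hot_board and pick_move, and the use of a plain score array standing in for the CNN output are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical cell codes; the paper's own encoding may differ.
EMPTY, BLACK, WHITE = 0, 1, 2


def one_hot_board(board):
    """Turn an (H, W) integer board into an (H, W, 3) one-hot array.

    Channel k is 1.0 where the cell holds code k, else 0.0.
    """
    return np.eye(3, dtype=np.float32)[board]


def pick_move(scores, board):
    """Return (row, col) of the highest-scoring empty cell.

    `scores` stands in for a per-cell network output (assumed finite).
    Occupied cells are masked out, so if the top-scoring cell is taken,
    the next-best empty cell is chosen as the alternative.
    Returns None when no empty cell remains.
    """
    masked = np.where(board == EMPTY, scores, -np.inf)
    if not np.isfinite(masked).any():
        return None
    row, col = np.unravel_index(int(np.argmax(masked)), board.shape)
    return int(row), int(col)
```

Masking occupied cells before taking the argmax is only one simple way to choose an alternative; the paper's selection rule, which combines similar one-hot states, may work differently.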
Pages: 1-15
Number of pages: 15