Double Q-learning Agent for Othello Board Game

Cited by: 0
Authors
Somasundaram, Thamarai Selvi [1 ]
Panneerselvam, Karthikeyan [1 ]
Bhuthapuri, Tarun
Mahadevan, Harini
Jose, Ashik
Affiliations
[1] Madras Inst Technol, Dept Comp Technol, Chennai, Tamil Nadu, India
Source
2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC) | 2018
Keywords
Double Q-learning; Othello; reinforcement learning; GO;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Applications of computers];
Subject classification code
081203; 0835
Abstract
This paper presents the first application of the Double Q-learning algorithm to the game of Othello. Reinforcement learning has previously been applied successfully to Othello using the canonical algorithms Q-learning and TD-learning, but both suffer from considerable drawbacks: Q-learning tends to overestimate action values during evaluation, while TD-learning can get stuck in local minima. To overcome these disadvantages, we propose a Double Q-learning agent for Othello and show that it performs better than the existing learning agents. In addition to developing and implementing the Double Q-learning agent, we implement Q-learning and TD-learning agents. All agents are trained and tested against two fixed opponents: a random player and a heuristic player. The Double Q-learning agent outperforms the existing learning agents, although it takes longer, on average, to make each move. Further, we show that the Double Q-learning agent performs best with two hidden layers using the tanh activation function.
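The overestimation the abstract attributes to Q-learning is addressed in Double Q-learning by keeping two value tables and letting one table pick the greedy next action while the other evaluates it. The following minimal tabular sketch illustrates that update rule only; the paper's Othello-specific state encoding and neural-network function approximation are omitted, and all names here are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.99  # illustrative learning rate and discount factor

QA = defaultdict(float)  # first value table
QB = defaultdict(float)  # second value table

def double_q_update(s, a, r, s_next, actions_next):
    """One Double Q-learning step: randomly pick a table to update,
    selecting the greedy action with it but evaluating that action
    with the *other* table (this decoupling curbs overestimation)."""
    if random.random() < 0.5:
        if actions_next:  # non-terminal state: bootstrap from the other table
            a_star = max(actions_next, key=lambda a2: QA[(s_next, a2)])
            target = r + GAMMA * QB[(s_next, a_star)]
        else:             # terminal state: target is just the reward
            target = r
        QA[(s, a)] += ALPHA * (target - QA[(s, a)])
    else:
        if actions_next:
            b_star = max(actions_next, key=lambda a2: QB[(s_next, a2)])
            target = r + GAMMA * QA[(s_next, b_star)]
        else:
            target = r
        QB[(s, a)] += ALPHA * (target - QB[(s, a)])
```

At acting time, the agent would typically choose moves greedily with respect to the sum (or average) of `QA` and `QB`, so both tables contribute to the policy.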
Pages: 216-223
Number of pages: 8