Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess

Cited by: 2
Authors
Li, Xiali [1]
Lv, Zhengyu [1]
Wu, Licheng [1]
Zhao, Yue [1]
Xu, Xiaona [1]
Affiliations
[1] Minzu Univ China, Sch Informat & Engn, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
GAME; GO; NETWORKS; PROGRAM; SHOGI;
DOI
10.1155/2020/4708075
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
In this study, a hybrid of state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms is applied to different stages of an upper confidence bounds applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all nodes on the search path when each game ends. A learning strategy is proposed that combines the SARSA(λ) and Q-learning algorithms with domain knowledge to form a feedback function for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program's learning efficiency and understanding ability for Tibetan Jiu chess.
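The two update rules named in the abstract can be sketched in tabular form. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the (state, action) dictionary keys, and the values of the step size `alpha`, discount `gamma`, and trace decay `lam` are all hypothetical, and the UCT search, the domain-knowledge feedback function, and the ResNet18-based network are omitted.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_next, next_actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    best_next = max((q[(s_next, b)] for b in next_actions), default=0.0)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def sarsa_lambda_update(q, traces, s, a, r, s_next, a_next,
                        alpha=0.1, gamma=0.9, lam=0.8):
    """On-policy update with accumulating eligibility traces."""
    # TD error uses the action actually chosen in s_next (on-policy).
    delta = r + gamma * q[(s_next, a_next)] - q[(s, a)]
    traces[(s, a)] += 1.0
    # Every recently visited pair shares the error, weighted by its trace.
    for key in list(traces):
        q[key] += alpha * delta * traces[key]
        traces[key] *= gamma * lam


# Toy usage with made-up states, actions, and rewards.
q_values = defaultdict(float)
q_learning_update(q_values, "s0", "a", 1.0, "s1", ["x", "y"])

sarsa_q = defaultdict(float)
sarsa_traces = defaultdict(float)
sarsa_lambda_update(sarsa_q, sarsa_traces, "s0", "a0", 0.0, "s1", "a1")
sarsa_lambda_update(sarsa_q, sarsa_traces, "s1", "a1", 1.0, "s2", "a2")
```

In the second SARSA(λ) step, the reward also raises the value of the earlier pair ("s0", "a0") through its decayed trace, which is the property that distinguishes it from a one-step update.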
Pages: 11
Related papers
50 in total
  • [1] A phased game algorithm combining deep reinforcement learning and UCT for Tibetan Jiu chess
    Li, Xiali
    Chen, Yandong
    Zhang, Yanyin
    Liu, Bo
    Wu, Licheng
    2023 IEEE 47TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC, 2023, : 390 - 395
  • [2] Hybrid Offline/Online Optimization for Energy Management via Reinforcement Learning
    Silvestri, Mattia
    De Filippo, Allegra
    Ruggeri, Federico
    Lombardi, Michele
    INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2022, 2022, 13292 : 358 - 373
  • [3] Control of Hybrid Electric Vehicle Powertrain Using Offline-Online Hybrid Reinforcement Learning
    Yao, Zhengyu
    Yoon, Hwan-Sik
    Hong, Yang-Ki
    ENERGIES, 2023, 16 (02)
  • [4] Offline Evaluation of Online Reinforcement Learning Algorithms
    Mandel, Travis
    Liu, Yun-En
    Brunskill, Emma
    Popovic, Zoran
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1926 - 1933
  • [5] Efficient Online Reinforcement Learning with Offline Data
    Ball, Philip J.
    Smith, Laura
    Kostrikov, Ilya
    Levine, Sergey
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [6] Hybrid offline-online reinforcement learning for obstacle avoidance in autonomous underwater vehicles
    Zhao, Jintao
    Liu, Tao
    Huang, Junhao
    SHIPS AND OFFSHORE STRUCTURES, 2024
  • [7] Enhancing UAV Aerial Docking: A Hybrid Approach Combining Offline and Online Reinforcement Learning
    Feng, Yuting
    Yang, Tao
    Yu, Yushu
    DRONES, 2024, 8 (05)
  • [8] Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
    Zheng, Han
    Luo, Xufang
    Wei, Pengfei
    Song, Xuan
    Li, Dongsheng
    Jiang, Jing
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11372 - 11380
  • [9] Strategy research based on chess shapes for Tibetan JIU computer game
    Li, Xiali
    Wang, Song
    Lv, Zhengyu
    Li, Yongji
    Wu, Licheng
    ICGA JOURNAL, 2018, 40 (03) : 318 - 328
  • [10] Online gaming platform for Tibetan Jiu chess: AI-vs-AI and human-vs-human
    Xu, Gan
    Li, Xiali
    Zhang, YanYin
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024, : 1506 - 1507