A Hierarchical Reinforcement Learning Algorithm Based On Heuristic Reward Function

Cited by: 5
Authors
Yan, Qicui [1]
Liu, Quan [1]
Hu, Daojing [1]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
Keywords
hierarchical reinforcement learning; heuristic reward function; Tetris; curse of dimensionality
DOI
10.1109/ICACC.2010.5486837
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A hierarchical reinforcement learning method based on a heuristic reward function is proposed to address two problems: the "curse of dimensionality", in which the state space grows exponentially with the number of features, and slow convergence. The method greatly reduces the state space and speeds up learning. Actions are chosen purposefully and efficiently so as to optimize the reward function and accelerate convergence. The method is applied to the game of Tetris; the experimental results show that it partly alleviates the "curse of dimensionality" and markedly improves convergence speed.
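The abstract does not give the reward function or the state abstraction, so the following Python sketch only illustrates the two ideas it names. It is not the authors' method. The 10 x 20 board size, the choice of features (holes, maximum column height), the weights, and all function names are assumptions made for this example.

```python
def raw_state_count(width, height):
    """Number of ways to mark every cell of a width x height grid as filled or empty."""
    return 2 ** (width * height)


def height_profile_count(width, height):
    """Number of states if only each column's height (0..height) is kept.

    This is a hypothetical abstraction, used here only to show the reduction.
    """
    return (height + 1) ** width


def column_heights(board):
    """Height of each column; board is a list of rows, top row first, cells 0 or 1."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        h = 0
        for r in range(rows):
            if board[r][c]:
                h = rows - r  # the topmost filled cell fixes the column height
                break
        heights.append(h)
    return heights


def count_holes(board):
    """Empty cells that have at least one filled cell above them in the same column."""
    holes = 0
    for c in range(len(board[0])):
        seen_filled = False
        for r in range(len(board)):
            if board[r][c]:
                seen_filled = True
            elif seen_filled:
                holes += 1
    return holes


def heuristic_reward(before, after, lines_cleared,
                     w_lines=1.0, w_holes=0.5, w_height=0.1):
    """Reward cleared lines; penalize new holes and growth of the tallest column.

    The weights are illustrative assumptions, not values from the paper.
    """
    new_holes = count_holes(after) - count_holes(before)
    height_gain = max(column_heights(after)) - max(column_heights(before))
    return w_lines * lines_cleared - w_holes * new_holes - w_height * height_gain


# Usage: a placement that creates one hole and raises the stack by one row.
before = [[0, 0, 0],
          [0, 0, 0],
          [1, 0, 1]]
after = [[0, 0, 0],
         [0, 1, 0],
         [1, 0, 1]]
print(raw_state_count(10, 20))       # 2**200 raw cell configurations
print(height_profile_count(10, 20))  # 21**10 column-height profiles
print(heuristic_reward(before, after, lines_cleared=0))
```

Under these assumptions, the heuristic term gives the learner feedback after every placement rather than only when a line is cleared, which is one common way a shaped reward can speed up convergence.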
Pages: 371-376
Number of pages: 6