A Hierarchical Reinforcement Learning Algorithm Based On Heuristic Reward Function

Cited by: 5
Authors
Yan, Qicui [1]
Liu, Quan [1]
Hu, Daojing [1]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
Keywords
hierarchical reinforcement learning; heuristic reward function; Tetris; curse of dimensionality;
DOI
10.1109/ICACC.2010.5486837
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
A hierarchical reinforcement learning method based on a heuristic reward function is proposed to address two problems: the "curse of dimensionality", in which the state space grows exponentially with the number of features, and slow convergence. The method greatly reduces the state space and speeds up learning. Actions are chosen purposefully and efficiently so as to optimize the reward function and accelerate convergence. The method is applied to the game of Tetris; the experimental results show that it partly alleviates the "curse of dimensionality" and markedly improves convergence speed.
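The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the board encoding, the chosen features (holes and maximum column height), the function names, and the weights are all assumptions made for illustration. It shows how the raw state count of a Tetris-like grid grows exponentially with the number of cells, and what a hand-written heuristic reward over a few board features might look like.

```python
from typing import List

# Assumed encoding: rows listed top to bottom; 1 = filled cell, 0 = empty.
Board = List[List[int]]


def raw_state_count(width: int, height: int) -> int:
    """Number of distinct binary occupancy grids: 2 ** (width * height)."""
    return 2 ** (width * height)


def column_heights(board: Board) -> List[int]:
    """Height of each column, measured from the bottom of the board."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        h = 0
        for r in range(rows):
            if board[r][c]:
                h = rows - r  # topmost filled cell fixes the height
                break
        heights.append(h)
    return heights


def count_holes(board: Board) -> int:
    """Count empty cells that have at least one filled cell above them."""
    rows, cols = len(board), len(board[0])
    holes = 0
    for c in range(cols):
        seen_filled = False
        for r in range(rows):
            if board[r][c]:
                seen_filled = True
            elif seen_filled:
                holes += 1
    return holes


def heuristic_reward(board: Board, lines_cleared: int) -> float:
    """Hypothetical shaped reward; the weights are illustrative, not from the paper."""
    return (
        1.0 * lines_cleared
        - 0.5 * count_holes(board)
        - 0.1 * max(column_heights(board))
    )
```

For a standard 10 x 20 board, `raw_state_count(10, 20)` is 2 ** 200, whereas a learner that only sees a few summary features works in a far smaller space. Which features and weights are appropriate is a design choice that this record does not specify.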
Pages: 371 - 376
Number of pages: 6
Related papers
50 in total
  • [1] Potential Based Reward Shaping for Hierarchical Reinforcement Learning
    Gao, Yang
    Toni, Francesca
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 3504 - 3510
  • [2] Hierarchical average reward reinforcement learning
    Ghavamzadeh, Mohammad
    Mahadevan, Sridhar
    JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 2629 - 2669
  • [3] Hierarchical average reward reinforcement learning
    Department of Computing Science, University of Alberta, Edmonton, Alta. T6G 2E8, Canada
    Unknown
    Journal of Machine Learning Research, 2007, 8 : 2629 - 2669
  • [4] A Modified Average Reward Reinforcement Learning Based on Fuzzy Reward Function
    Zhai, Zhenkun
    Chen, Wei
    Li, Xiong
    Guo, Jing
    IMECS 2009: INTERNATIONAL MULTI-CONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, VOLS I AND II, 2009, : 113 - 117
  • [5] A Reward Optimization Method Based on Action Subrewards in Hierarchical Reinforcement Learning
    Fu, Yuchen
    Liu, Quan
    Ling, Xionghong
    Cui, Zhiming
    SCIENTIFIC WORLD JOURNAL, 2014,
  • [6] A Navigation Algorithm Based on the Reinforcement Learning Reward System and Optimised with Genetic Algorithm
    Cabezas-Olivenza, Mireya
    Zulueta, Ekaitz
    Azurmendi-Marquinez, Iker
    Fernandez-Gamiz, Unai
    Rico-Melgosa, Danel
    MATHEMATICS, 2024, 12 (24)
  • [7] Point Cloud Registration via Heuristic Reward Reinforcement Learning
    Chen, Bingren
    STATS, 2023, 6 (01): : 268 - 278
  • [8] Transfer in variable-reward hierarchical reinforcement learning
    Neville Mehta
    Sriraam Natarajan
    Prasad Tadepalli
    Alan Fern
    Machine Learning, 2008, 73 : 289 - 312
  • [9] Transfer in variable-reward hierarchical reinforcement learning
    Mehta, Neville
    Natarajan, Sriraam
    Tadepalli, Prasad
    Fern, Alan
    MACHINE LEARNING, 2008, 73 (03) : 289 - 312
  • [10] Meta-Reinforcement Learning Algorithm Based on Reward and Dynamic Inference
    Chen, Jinhao
    Zhang, Chunhong
    Hu, Zheng
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT III, PAKDD 2024, 2024, 14647 : 223 - 234