A Hierarchical Reinforcement Learning Algorithm Based On Heuristic Reward Function

Cited by: 5

Authors:

Yan, Qicui [1]

Liu, Quan [1]

Hu, Daojing [1]

Affiliation:

[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China

Source:

2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL (ICACC 2010), VOL. 3 | 2010

Keywords:

hierarchical reinforcement learning; heuristic reward function; Tetris; curse of dimensionality

DOI:

10.1109/ICACC.2010.5486837

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

A hierarchical reinforcement learning method based on a heuristic reward function is proposed to address two problems: the "curse of dimensionality", in which the state space grows exponentially with the number of features, and slow convergence. The method greatly reduces the state space and speeds up learning. Actions are chosen purposefully and efficiently so as to optimize the reward function and accelerate convergence. The method is applied to the game of Tetris; the experimental results show that it partly alleviates the "curse of dimensionality" and markedly improves convergence speed.
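
The abstract does not spell out the heuristic reward function itself. As a rough illustration only, the sketch below shows what a heuristic reward for a Tetris placement might look like. The board features (holes, maximum column height), the weights, and every function name here are assumptions chosen for the example; they are not taken from the paper.

```python
from typing import List

# Assumed board encoding: rows listed top to bottom, 1 = filled cell, 0 = empty.
Board = List[List[int]]


def column_heights(board: Board) -> List[int]:
    """Height of each column, measured from the bottom of the board."""
    rows = len(board)
    heights = []
    for c in range(len(board[0])):
        h = 0
        for r in range(rows):
            if board[r][c]:
                h = rows - r  # topmost filled cell fixes the height
                break
        heights.append(h)
    return heights


def count_holes(board: Board) -> int:
    """Number of empty cells with at least one filled cell above them."""
    holes = 0
    for c in range(len(board[0])):
        seen_filled = False
        for r in range(len(board)):
            if board[r][c]:
                seen_filled = True
            elif seen_filled:
                holes += 1
    return holes


def heuristic_reward(
    before: Board,
    after: Board,
    lines_cleared: int,
    w_lines: float = 1.0,   # hypothetical weight
    w_holes: float = 0.5,   # hypothetical weight
    w_height: float = 0.1,  # hypothetical weight
) -> float:
    """Reward cleared lines; penalize newly created holes and stack growth."""
    new_holes = count_holes(after) - count_holes(before)
    height_gain = max(column_heights(after)) - max(column_heights(before))
    return w_lines * lines_cleared - w_holes * new_holes - w_height * height_gain


# A 4x3 board with a gap in the middle of the bottom row.
before = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [1, 0, 1]]
# Placement that covers the gap, leaving a hole.
after_hole = [[0, 0, 0], [0, 0, 0], [1, 1, 1], [1, 0, 1]]
# Placement that fills the gap.
after_fill = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [1, 1, 1]]

r_hole = heuristic_reward(before, after_hole, lines_cleared=0)
r_fill = heuristic_reward(before, after_fill, lines_cleared=0)
```

In this toy case the gap-filling placement receives the higher reward, so the agent gets a useful signal after every piece rather than only when a line clears. A denser signal of this kind is one plausible reading of how a heuristic reward could speed up convergence.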

Pages: 371-376

Page count: 6
