Hierarchical Reinforcement Learning: A Comprehensive Survey

Cited by: 242

Authors:

Pateria, Shubham [1]

Subagdja, Budhitama [2]

Tan, Ah-hwee [2]

Quek, Chai [1]

Affiliations:

[1] Nanyang Technol Univ, Sch Comp Sci & Engn, 50 Nanyang Ave, Singapore 639798, Singapore

[2] Singapore Management Univ, Sch Comp & Informat Syst, 80 Stamford Rd, Singapore 178902, Singapore

Source:

ACM COMPUTING SURVEYS | 2021 / Vol. 54 / Issue 05

Funding:

National Research Foundation, Singapore;

Keywords:

Hierarchical reinforcement learning; subtask discovery; skill discovery; hierarchical reinforcement learning survey; hierarchical reinforcement learning taxonomy; TEMPORAL ABSTRACTION; FRAMEWORK; OPTIONS;

DOI:

10.1145/3453160

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate future research in HRL. Furthermore, we outline a few suitable task domains for evaluating the HRL approaches and a few interesting examples of the practical applications of HRL in the Supplementary Material.

References

Pages: 35

104 entries in total

