Research and Development on Deep Hierarchical Reinforcement Learning

Cited: 0
Authors
Huang Z.-G. [1 ]
Liu Q. [1 ,2 ,3 ,4 ]
Zhang L.-H. [1 ]
Cao J.-Q. [1 ]
Zhu F. [1 ,2 ,3 ,4 ]
Affiliations
[1] School of Computer Science and Technology, Soochow University, Suzhou
[2] Jiangsu Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou
[3] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education (Jilin University), Changchun
[4] Collaborative Innovation Center of Novel Software Technology and Industrialization (Nanjing), Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2023 / Vol. 34 / No. 02
Keywords
artificial intelligence; deep hierarchical reinforcement learning; deep reinforcement learning; reinforcement learning; semi-Markov decision process;
DOI
10.13328/j.cnki.jos.006706
Abstract
Deep hierarchical reinforcement learning (DHRL) is an important research field within deep reinforcement learning (DRL). It addresses sparse rewards, long-horizon sequential decision-making, and weak transfer ability, problems that classic DRL struggles to solve. Based on hierarchical thinking, DHRL decomposes complex problems and constructs a multi-layered structure for DRL strategies. Through temporal abstraction, DHRL combines lower-level actions into semantically meaningful higher-level actions. In recent years, DHRL has achieved breakthroughs in many domains and shown strong performance, with real-world applications in visual navigation, natural language processing, recommendation systems, and video description generation. This study first introduces the theoretical basis of hierarchical reinforcement learning (HRL). Second, it describes the key technologies of DHRL, including hierarchical abstraction techniques and common experimental environments. Third, taking the option-based deep hierarchical reinforcement learning framework (O-DHRL) and the subgoal-based deep hierarchical reinforcement learning framework (G-DHRL) as the main research objects, it analyzes and compares the research status and development trends of various algorithms in detail. In addition, a number of real-world DHRL applications are discussed. Finally, DHRL is summarized and its future directions are considered. © 2023 Chinese Academy of Sciences. All rights reserved.
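The temporal abstraction described in the abstract is commonly formalized via the options framework over a semi-Markov decision process, in which an option is a triple (initiation set, intra-option policy, termination condition). A minimal sketch follows; the class names, helper functions, and toy corridor environment are illustrative assumptions, not taken from the survey:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Option:
    """An option (I, pi, beta) in the options framework."""
    initiation: Callable[[int], bool]    # I: states where the option may start
    policy: Callable[[int], str]         # pi: maps state -> primitive action
    termination: Callable[[int], float]  # beta: termination probability per state

def run_option(option: Option, state: int,
               step: Callable[[int, str], int],
               max_steps: int = 100) -> Tuple[int, List[Tuple[int, str]]]:
    """Execute the option's low-level policy until beta triggers termination,
    so one high-level decision spans multiple primitive time steps."""
    trajectory = []
    for _ in range(max_steps):
        action = option.policy(state)
        trajectory.append((state, action))
        state = step(state, action)
        if option.termination(state) >= 1.0:  # deterministic termination here
            break
    return state, trajectory

# Toy corridor with states 0..5 and a "go right until the wall" option.
go_right = Option(
    initiation=lambda s: s < 5,
    policy=lambda s: "right",
    termination=lambda s: 1.0 if s == 5 else 0.0,
)
step = lambda s, a: min(s + 1, 5) if a == "right" else max(s - 1, 0)

final_state, traj = run_option(go_right, 0, step)
```

Here a single invocation of `go_right` stands in for five primitive actions, which is the sense in which an option turns a sequence of lower-level actions into one semantic higher-level action.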
Pages: 733-760
Page count: 27