Hierarchical Reinforcement Learning with Options and United Neural Network Approximation

Cited by: 2
Authors
Kuzmin, Vadim [1 ]
Panov, Aleksandr I. [2 ,3 ]
Affiliations
[1] Natl Res Univ Higher Sch Econ, Moscow, Russia
[2] Moscow Inst Phys & Technol, Moscow, Russia
[3] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia
Source
PROCEEDINGS OF THE THIRD INTERNATIONAL SCIENTIFIC CONFERENCE INTELLIGENT INFORMATION TECHNOLOGIES FOR INDUSTRY (IITI'18), VOL 1 | 2019 / Vol. 874
Funding
Russian Science Foundation
Keywords
Hierarchical reinforcement learning; Options; Neural network; DQN; Deep neural network; Q-learning;
DOI
10.1007/978-3-030-01818-4_45
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The "curse of dimensionality" and environments with sparse, delayed rewards are among the main challenges in reinforcement learning (RL). To tackle these problems we can use hierarchical reinforcement learning (HRL), which provides abstraction over both the actions and the states of the environment. This work proposes an algorithm that combines a hierarchical approach to RL with the ability of neural networks to serve as universal function approximators. To build the hierarchy of actions, the options framework is used, whose main idea is to employ macro-actions (sequences of simpler actions). The state of the environment is the input to a convolutional neural network that plays the role of a Q-function, estimating the utility of every possible action and skill in the given state. We learn each option separately using different neural networks and then combine the results into one architecture with a top-level approximator. We compare the performance of the proposed algorithm with the deep Q-network algorithm (DQN) in an environment where the aim of a magnet-arm robot is to build a tower from bricks.
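The scheme the abstract describes — a top-level value function that chooses among primitive actions and options (macro-actions), with each option learned as its own policy — can be sketched in a toy tabular form. The corridor environment, the single hand-crafted "go right" option, and all hyperparameters below are illustrative assumptions, not the authors' setup; the paper uses convolutional networks as approximators instead of the tables shown here.

```python
import numpy as np

# Toy corridor: the agent starts at cell 0 and must reach cell N-1.
# Primitive actions move left/right; one option ("go right until the end")
# acts as a macro-action. A tabular Q-function over {primitives + options}
# stands in for the paper's top-level neural approximator.

N = 6                       # corridor length
ACTIONS = [-1, +1]          # primitive moves: left, right
N_CHOICES = len(ACTIONS) + 1  # two primitives + one option

def run_option(state, gamma):
    """Option policy: keep moving right until the terminal cell.
    Returns (final state, discounted reward earned inside the option,
    number of elapsed steps), as in SMDP-style Q-learning."""
    steps = 0
    while state < N - 1:
        state += 1
        steps += 1
    return state, gamma ** (steps - 1), steps  # reward 1.0 at the goal

def train(episodes=200, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N, N_CHOICES))
    for _ in range(episodes):
        s = 0
        while s < N - 1:
            a = rng.integers(N_CHOICES) if rng.random() < eps else int(Q[s].argmax())
            if a < len(ACTIONS):                        # primitive action
                s2 = min(max(s + ACTIONS[a], 0), N - 1)
                r, k = (1.0 if s2 == N - 1 else 0.0), 1
            else:                                       # macro-action (option)
                s2, r, k = run_option(s, gamma)
            # SMDP Q-learning update: bootstrap discounted by the option's duration k
            Q[s, a] += alpha * (r + gamma ** k * Q[s2].max() - Q[s, a])
            s = s2
    return Q

Q = train()
print(Q[0])  # top-level values in the start state: [left, right, option]
```

The key difference from flat Q-learning is the `gamma ** k` term: an option that runs for `k` steps is credited with its internally accumulated discounted reward and bootstraps `k` steps ahead, which is what lets a single top-level update span a long action sequence.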
Pages: 453-462
Page count: 10