FeUdal Networks for Hierarchical Reinforcement Learning

Cited by: 0
Authors
Vezhnevets, Alexander Sasha [1 ]
Osindero, Simon [1 ]
Schaul, Tom [1 ]
Heess, Nicolas [1 ]
Jaderberg, Max [1 ]
Silver, David [1 ]
Kavukcuoglu, Koray [1 ]
Affiliations
[1] DeepMind, London, England
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70 | 2017 / Vol. 70
Keywords: (none listed)
DOI: (none available)
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We introduce FeUdal Networks (FuNs): a novel architecture for hierarchical reinforcement learning. Our approach is inspired by the feudal reinforcement learning proposal of Dayan and Hinton, and gains power and efficacy by decoupling end-to-end learning across multiple levels - allowing it to utilise different resolutions of time. Our framework employs a Manager module and a Worker module. The Manager operates at a lower temporal resolution and sets abstract goals which are conveyed to and enacted by the Worker. The Worker generates primitive actions at every tick of the environment. The decoupled structure of FuN conveys several benefits - in addition to facilitating very long timescale credit assignment it also encourages the emergence of sub-policies associated with different goals set by the Manager. These properties allow FuN to dramatically outperform a strong baseline agent on tasks that involve long-term credit assignment or memorisation.
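The abstract's core idea, a Manager that emits abstract goals at a lower temporal resolution while a Worker acts at every environment tick, can be sketched as a minimal control loop. This is an illustrative toy, not the paper's implementation: the weight matrices, dimensions, and the goal horizon `HORIZON_C` are hypothetical stand-ins, and a real FuN learns both modules end-to-end rather than using fixed random weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and goal horizon; hypothetical, not from the paper.
STATE_DIM, GOAL_DIM, N_ACTIONS, HORIZON_C = 8, 4, 3, 10

# Fixed random stand-in weights; the actual FuN trains both modules end-to-end.
W_manager = rng.standard_normal((GOAL_DIM, STATE_DIM))
W_worker = rng.standard_normal((N_ACTIONS, STATE_DIM + GOAL_DIM))

def manager_goal(state):
    # Manager: maps the current state to a unit-norm abstract goal direction.
    g = W_manager @ state
    return g / (np.linalg.norm(g) + 1e-8)

def worker_action(state, goal):
    # Worker: picks a primitive action conditioned on the state AND the goal.
    logits = W_worker @ np.concatenate([state, goal])
    return int(np.argmax(logits))

def run_episode(n_steps=25):
    state = rng.standard_normal(STATE_DIM)
    goal = np.zeros(GOAL_DIM)
    actions, goal_updates = [], 0
    for t in range(n_steps):
        if t % HORIZON_C == 0:  # Manager ticks only every HORIZON_C steps
            goal = manager_goal(state)
            goal_updates += 1
        actions.append(worker_action(state, goal))  # Worker acts every tick
        state = rng.standard_normal(STATE_DIM)  # placeholder env transition
    return actions, goal_updates

actions, goal_updates = run_episode()
```

The decoupling is visible in the counters: over 25 ticks the Worker emits 25 primitive actions while the Manager updates its goal only three times, which is what lets credit flow over long horizons at the Manager level.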
Pages: 10
Related Papers (35 total)
  • [11] Dayan, Peter, 1993, Advances in Neural Information Processing Systems, p. 271
  • [12] Dietterich, Thomas G., 2000, Journal of Artificial Intelligence Research
  • [13] Hochreiter, S., 1997, Neural Computation, Vol. 9, p. 1735, DOI 10.1162/neco.1997.9.8.1735
  • [14] Jaderberg, Max, 2016, arXiv:1611.05397
  • [15] Kaelbling, Leslie Pack, 2014, ICML
  • [16] Kulkarni, T. D., 2016, Advances in Neural Information Processing Systems, p. 3682
  • [17] Lake, Brenden M.; Ullman, Tomer D.; Tenenbaum, Joshua B.; Gershman, Samuel J., 2017, "Building machines that learn and think like people", Behavioral and Brain Sciences, Vol. 40
  • [18] Lillicrap, T. P., 2015, 4th International Conference on Learning Representations, DOI 10.48550/arXiv.1509.02971
  • [19] Mnih, V., 2016, Proceedings of Machine Learning Research, Vol. 48
  • [20] Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K.; Ostrovski, Georg; Petersen, Stig; Beattie, Charles; Sadik, Amir; Antonoglou, Ioannis; King, Helen; Kumaran, Dharshan; Wierstra, Daan; Legg, Shane; Hassabis, Demis, 2015, "Human-level control through deep reinforcement learning", Nature, Vol. 518 (7540), pp. 529-533