The Dreaming Variational Autoencoder for Reinforcement Learning Environments

Cited by: 8

Authors:

Andersen, Per-Arne [1]

Goodwin, Morten [1]

Granmo, Ole-Christoffer [1]

Affiliations:

[1] Univ Agder, Dept ICT, Grimstad, Norway

Source:

ARTIFICIAL INTELLIGENCE XXXV (AI 2018) | 2018 / Vol. 11311

Keywords:

Deep reinforcement learning; Environment modeling; Neural networks; Variational autoencoder; Markov decision processes; Exploration; Artificial experience-replay;

DOI:

10.1007/978-3-030-04191-5_11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. There are several challenges in the current state-of-the-art reinforcement learning algorithms that prevent them from converging towards the global optima. It is likely that the solution to these problems lies in short- and long-term planning, exploration and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms as they provide a flexible, reproducible, and easy to control environment. Regardless, few games feature a state-space where results in exploration, memory, and planning are easily perceived. This paper presents The Dreaming Variational Autoencoder (DVAE), a neural network based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE in partial and fully-observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.

Pages: 143-155

Page count: 13
