Evolving Deep Unsupervised Convolutional Networks for Vision-Based Reinforcement Learning

Cited by: 73
Authors
Koutnik, Jan [1]
Schmidhuber, Juergen [1]
Gomez, Faustino [1]
Affiliation
[1] USI SUPSI, IDSIA, CH-6928 Manno Lugano, Switzerland
Source
GECCO'14: PROCEEDINGS OF THE 2014 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE | 2014
Keywords
deep learning; neuroevolution; vision-based TORCS; reinforcement learning; games; NEURAL NETS
DOI
10.1145/2576768.2598358
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Dealing with high-dimensional input spaces, like visual input, is a challenging task for reinforcement learning (RL). Neuroevolution (NE), used for continuous RL problems, has to either reduce the problem dimensionality by (1) compressing the representation of the neural network controllers or (2) employing a pre-processor (compressor) that transforms the high-dimensional raw inputs into low-dimensional features. In this paper, we are able to evolve extremely small recurrent neural network (RNN) controllers for a task that previously required networks with over a million weights. The high-dimensional visual input, which the controller would normally receive, is first transformed into a compact feature vector through a deep, max-pooling convolutional neural network (MPCNN). Both the MPCNN preprocessor and the RNN controller are evolved successfully to control a car in the TORCS racing simulator using only visual input. This is the first use of deep learning in the context of evolutionary RL.
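The pipeline described in the abstract (image, then MPCNN compressor, then a small recurrent controller) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 32x32 single-channel input, the three layers, the kernel and pooling sizes, the tanh nonlinearity, the three-unit controller, and all function and class names. Weights are drawn at random as a stand-in for a candidate produced by an evolutionary search, and no evolution loop is shown.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Valid 2-D correlation of a (C, H, W) input with (K, C, kh, kw) kernels."""
    k, _, kh, kw = kernels.shape
    _, h, w = image.shape
    out = np.zeros((k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = image[:, i:i + kh, j:j + kw]
            # Contract channel and spatial axes, leaving one value per kernel.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def max_pool(maps, size):
    """Non-overlapping max-pooling; rows/columns that do not fill a window are cropped."""
    k, h, w = maps.shape
    h2, w2 = h // size, w // size
    cropped = maps[:, :h2 * size, :w2 * size]
    return cropped.reshape(k, h2, size, w2, size).max(axis=(2, 4))

def mpcnn_features(image, layers):
    """Alternate convolution and max-pooling, then flatten to a feature vector."""
    x = image
    for kernels, pool in layers:
        x = np.tanh(max_pool(conv2d_valid(x, kernels), pool))
    return x.reshape(-1)

class TinyRNN:
    """Small fully recurrent controller: one tanh layer fed by its own previous state."""
    def __init__(self, n_in, n_out, rng):
        self.w_in = rng.normal(scale=0.5, size=(n_out, n_in))
        self.w_rec = rng.normal(scale=0.5, size=(n_out, n_out))
        self.bias = np.zeros(n_out)
        self.state = np.zeros(n_out)

    def step(self, features):
        self.state = np.tanh(self.w_in @ features + self.w_rec @ self.state + self.bias)
        return self.state

rng = np.random.default_rng(0)
# Hypothetical architecture: (kernels, pooling size) per layer.
layers = [
    (rng.normal(size=(4, 1, 5, 5)), 2),  # 1x32x32 -> 4x28x28 -> 4x14x14
    (rng.normal(size=(3, 4, 5, 5)), 2),  # 4x14x14 -> 3x10x10 -> 3x5x5
    (rng.normal(size=(3, 3, 5, 5)), 1),  # 3x5x5   -> 3x1x1
]
image = rng.random((1, 32, 32))           # stand-in for one camera frame
features = mpcnn_features(image, layers)  # compact feature vector
controller = TinyRNN(n_in=features.size, n_out=3, rng=rng)
action = controller.step(features)        # e.g. three control signals (assumed)
n_conv_weights = sum(kernels.size for kernels, _ in layers)
```

In this sketch the controller sees only three numbers per frame instead of 1,024 pixels, which is the dimensionality reduction the abstract refers to as option (2).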
Pages: 541-548
Page count: 8