Deep learning in neural networks: An overview

Cited by: 11544
Authors
Schmidhuber, Juergen [1 ,2 ]
Affiliations
[1] Univ Lugano, Ist Dalle Molle Studi Intelligenza Artificiale, Swiss AI Lab IDSIA, CH-6928 Manno Lugano, Switzerland
[2] SUPSI, CH-6928 Manno Lugano, Switzerland
Keywords
Deep learning; Supervised learning; Unsupervised learning; Reinforcement learning; Evolutionary computation; PRINCIPAL COMPONENT ANALYSIS; LONG-TERM DEPENDENCIES; INFERIOR TEMPORAL CORTEX; SLOW FEATURE ANALYSIS; RECEPTIVE-FIELDS; SPIKING NEURONS; PATTERN-RECOGNITION; OBJECT RECOGNITION; FEATURE-EXTRACTION; SELF-ORGANIZATION;
DOI
10.1016/j.neunet.2014.09.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks. (C) 2014 Published by Elsevier Ltd.
Pages: 85-117
Page count: 33
Cited references: 884