Perceptron: Learning, Generalization, Model Selection, Fault Tolerance, and Role in the Deep Learning Era

Cited by: 39
Authors
Du, Ke-Lin [1 ]
Leung, Chi-Sing [2 ]
Mow, Wai Ho [3 ]
Swamy, M. N. S. [1 ]
Affiliations
[1] Concordia Univ, Dept Elect & Comp Engn, Montreal, PQ H3G 1M8, Canada
[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[3] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
Keywords
multilayer perceptron; perceptron; backpropagation; stochastic gradient descent; second-order learning; model selection; robust learning; deep learning; FEEDFORWARD NEURAL-NETWORKS; CONJUGATE-GRADIENT ALGORITHM; EXTENDED KALMAN FILTER; ERROR BACKPROPAGATION ALGORITHM; WEIGHT INITIALIZATION METHOD; FUZZY MEMBERSHIP FUNCTIONS; MULTILAYER PERCEPTRONS; TRAINING ALGORITHM; BACK-PROPAGATION; PRUNING ALGORITHM;
DOI
10.3390/math10244730
CLC number
O1 [Mathematics]
Subject classification
0701; 070101
Abstract
The single-layer perceptron, introduced by Rosenblatt in 1958, is one of the earliest and simplest neural network models; however, it cannot classify patterns that are not linearly separable. A new era of neural network research began in 1986, when the backpropagation (BP) algorithm was rediscovered for training the multilayer perceptron (MLP). An MLP with a sufficiently large number of hidden nodes can act as a universal approximator. To date, the MLP remains the most fundamental, most important, and most extensively investigated neural network model; even in the deep learning era, it is still among the most widely studied and used models. Numerous new results have been obtained in the past three decades. This survey gives a comprehensive, state-of-the-art introduction to the perceptron model, with emphasis on learning, generalization, model selection, and fault tolerance, and it describes the role of the perceptron model in the deep learning era. The paper provides a concluding survey of perceptron learning, covering the major achievements of the past seven decades, and it also serves as a tutorial on perceptron learning.
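To make the linear-separability limitation mentioned in the abstract concrete, below is a minimal sketch of Rosenblatt's perceptron learning rule in Python/NumPy. It is an illustration only, not code from the paper; the function name, learning rate, and toy data are assumptions chosen for the example.

import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=100):
    # Rosenblatt's rule: on a mistake, move the hyperplane toward the
    # misclassified point: w <- w + lr*(y - yhat)*x, b <- b + lr*(y - yhat).
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            yhat = 1 if np.dot(w, xi) + b > 0 else 0  # hard-threshold unit
            if yhat != yi:
                w += lr * (yi - yhat) * xi
                b += lr * (yi - yhat)
                mistakes += 1
        if mistakes == 0:  # every training point is correctly separated
            break
    return w, b

# Linearly separable toy task (logical AND): the rule converges quickly.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(train_perceptron(X, np.array([0, 0, 0, 1])))
# With XOR targets [0, 1, 1, 0] no separating hyperplane exists, so the
# loop never reaches zero mistakes: the limitation the MLP overcomes.

By the perceptron convergence theorem, this mistake-driven update reaches a separating hyperplane in finitely many updates whenever one exists; for XOR none exists, which is precisely the gap closed by an MLP with hidden nodes trained by BP.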
Pages: 46