Best k-Layer Neural Network Approximations

Cited by: 1
Authors
Lim, Lek-Heng [1]
Michalek, Mateusz [2, 3]
Qi, Yang [4]
Affiliations
[1] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Konstanz, D-78457 Constance, Germany
[4] Ecole Polytech, INRIA Saclay Ile France, CMAP, IP Paris, CNRS, F-91128 Palaiseau, France
Keywords
Neural network; Best approximation; Join loci; Secant loci;
DOI
10.1007/s00365-021-09545-2
Chinese Library Classification number
O1 [Mathematics];
Discipline classification codes
0701; 070101;
Abstract
We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set $s_1, \dots, s_n \in \mathbb{R}^p$ with corresponding responses $t_1, \dots, t_n \in \mathbb{R}^q$, fitting a $k$-layer neural network $v_\theta : \mathbb{R}^p \to \mathbb{R}^q$ involves estimation of the weights $\theta \in \mathbb{R}^m$ via an ERM: $\inf_{\theta \in \mathbb{R}^m} \sum_{i=1}^{n} \lVert t_i - v_\theta(s_i) \rVert_2^2$. We show that even for $k = 2$, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. In addition, we deduce that if one attempts to minimize such a loss function in the event when its infimum is not attainable, it necessarily results in values of $\theta$ diverging to $\pm\infty$. We will show that for smooth activations $\sigma(x) = 1/(1 + \exp(-x))$ and $\sigma(x) = \tanh(x)$, such failure to attain an infimum can happen on a positive-measured subset of responses. For the ReLU activation $\sigma(x) = \max(0, x)$, we completely classify cases where the ERM for a best two-layer neural network approximation attains its infimum. In recent applications of neural networks, where overfitting is commonplace, the failure to attain an infimum is avoided by ensuring that the system of equations $t_i = v_\theta(s_i)$, $i = 1, \dots, n$, has a solution. For a two-layer ReLU-activated network, we will show when such a system of equations has a solution generically, i.e., when can such a neural network be fitted perfectly with probability one.
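As a rough illustration of the ERM objective in the abstract, the following minimal Python sketch evaluates the sum of squared errors for a two-layer ReLU network on a small training set. It assumes that "two-layer" means one hidden layer: an affine map, a ReLU, then a second affine map. That reading is an assumption, not something the abstract spells out. The names `two_layer_network`, `empirical_risk`, and the packing of the weights as `(W1, b1, W2, b2)` are hypothetical choices made for this example only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
# Hypothetical packing of the weights theta: (W1, b1, W2, b2).
Theta = Tuple[Matrix, Vector, Matrix, Vector]


def relu(x: float) -> float:
    """ReLU activation: sigma(x) = max(0, x)."""
    return max(0.0, x)


def affine(W: Matrix, b: Vector, x: Sequence[float]) -> Vector:
    """Return W x + b for a row-major matrix W."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def two_layer_network(theta: Theta, s: Sequence[float]) -> Vector:
    """Assumed form of v_theta: affine map, ReLU, then affine map."""
    W1, b1, W2, b2 = theta
    hidden = [relu(h) for h in affine(W1, b1, s)]
    return affine(W2, b2, hidden)


def empirical_risk(theta: Theta,
                   S: Sequence[Sequence[float]],
                   T: Sequence[Sequence[float]]) -> float:
    """Sum over i of the squared 2-norm of t_i - v_theta(s_i)."""
    total = 0.0
    for s, t in zip(S, T):
        prediction = two_layer_network(theta, s)
        total += sum((tj - pj) ** 2 for tj, pj in zip(t, prediction))
    return total


# With p = q = 1 and two hidden units, these weights give v_theta(s) = |s|.
theta_abs: Theta = ([[1.0], [-1.0]], [0.0, 0.0], [[1.0, 1.0]], [0.0])
S = [[-1.0], [0.0], [2.0]]

print(empirical_risk(theta_abs, S, [[1.0], [0.0], [2.0]]))  # perfect fit
print(empirical_risk(theta_abs, S, [[1.0], [1.0], [0.0]]))  # imperfect fit
```

The sketch only evaluates the objective at a given `theta`; it does not search for a minimizer, and it does not reproduce the paper's analysis of when the infimum is attained.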
Pages: 583-604
Number of pages: 22