Over-parametrized deep neural networks minimizing the empirical risk do not generalize well

Cited by: 11
Authors
Kohler, Michael [1]
Krzyzak, Adam [2]
Affiliations
[1] Tech Univ Darmstadt, Fachbereich Math, Schlossgartenstr 7, D-64289 Darmstadt, Germany
[2] Concordia Univ, Dept Comp Sci & Software Engn, 1455 De Maisonneuve Blvd, Montreal, PQ H3G 1M8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Neural networks; nonparametric regression; over-parametrization; rate of convergence; REGRESSION;
DOI
10.3150/21-BEJ1323
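Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract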
Recently it was shown in several papers that backpropagation is able to find the global minimum of the empirical risk on the training data using over-parametrized deep neural networks. In this paper, a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not generalize well on new data, in the sense that networks which minimize the empirical risk do not achieve the optimal minimax rate of convergence for the estimation of smooth regression functions.
Pages: 2564-2597
Number of pages: 34
Related papers
42 in total
  • [1] Allen-Zhu Z., 2019, ADV NEURAL INFORM PR, P6158
  • [2] Allen-Zhu Z, 2019, PR MACH LEARN RES, V97
  • [3] Arora S., 2019, 7 INT C LEARN REPR I
  • [4] Benign overfitting in linear regression
    Bartlett, Peter L.
    Long, Philip M.
    Lugosi, Gabor
    Tsigler, Alexander
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2020, 117 (48) : 30063 - 30070
  • [5] ON DEEP LEARNING AS A REMEDY FOR THE CURSE OF DIMENSIONALITY IN NONPARAMETRIC REGRESSION
    Bauer, Benedikt
    Kohler, Michael
    [J]. ANNALS OF STATISTICS, 2019, 47 (04) : 2261 - 2285
  • [6] Belkin M., 2020, P 22 INT C ART INT S, V119, P1611
  • [7] Reconciling modern machine-learning practice and the classical bias-variance trade-off
    Belkin, Mikhail
    Hsu, Daniel
    Ma, Siyuan
    Mandal, Soumik
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2019, 116 (32) : 15849 - 15854
  • [8] Braun A., 2019, RATE CONVERGEN UNPUB
  • [9] Bubeck S., 2020, NETWORK SIZE WEIGHTS
  • [10] Cao Y, 2020, AAAI CONF ARTIF INTE, V34, P3349