From Tempered to Benign Overfitting in ReLU Neural Networks

Cited: 0
Authors
Kornowski, Guy [1 ]
Yehudai, Gilad [1 ]
Shamir, Ohad [1 ]
Affiliations
[1] Weizmann Inst Sci, Rehovot, Israel
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
European Research Council;
Keywords
KERNEL RIDGELESS REGRESSION;
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Overparameterized neural networks (NNs) are observed to generalize well even when trained to perfectly fit noisy data. This phenomenon motivated a large body of work on "benign overfitting", where interpolating predictors achieve near-optimal performance. Recently, it was conjectured and empirically observed that the behavior of NNs is often better described as "tempered overfitting", where the performance is non-optimal yet also non-trivial, and degrades as a function of the noise level. However, a theoretical justification of this claim for non-linear NNs has been lacking so far. In this work, we provide several results that aim at bridging these complementary views. We study a simple classification setting with 2-layer ReLU NNs, and prove that under various assumptions, the type of overfitting transitions from tempered in the extreme case of one-dimensional data, to benign in high dimensions. Thus, we show that the input dimension plays a crucial role in the overfitting profile in this setting, which we also validate empirically for intermediate dimensions. Overall, our results shed light on the intricate connections between the dimension, sample size, architecture and training algorithm on the one hand, and the type of resulting overfitting on the other.
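To make the abstract's central object concrete, the following is a minimal illustrative sketch (not taken from the paper): a 2-layer ReLU network on one-dimensional inputs, constructed explicitly so that it exactly interpolates a training set in which one label has been flipped to play the role of label noise. The data points and the piecewise-linear construction are illustrative assumptions; the paper analyzes the generalization of such interpolating predictors, not this particular construction.

```python
# Illustrative sketch: a width-(n-1) 2-layer ReLU network
#   f(x) = y_0 + sum_i c_i * relu(x - x_i)
# that perfectly fits noisy 1-D training labels. The data below are
# hypothetical; the flipped label stands in for label noise.

def relu(z):
    return max(0.0, z)

def build_interpolant(xs, ys):
    """Return a 2-layer ReLU net passing through every (x_i, y_i).
    Assumes xs is sorted and strictly increasing."""
    # Slope of each linear piece between consecutive points.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]
    # Each hidden ReLU unit contributes the *change* in slope at x_i.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1]
                            for i in range(1, len(slopes))]
    def f(x):
        return ys[0] + sum(c * relu(x - xi) for c, xi in zip(coeffs, xs[:-1]))
    return f

xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [1.0, 1.0, -1.0, 1.0, 1.0, 1.0]   # the -1.0 is a "noisy" label
f = build_interpolant(xs, ys)

# The network interpolates: it fits every training point, noise included.
assert all(abs(f(x) - y) < 1e-9 for x, y in zip(xs, ys))
```

The question the paper studies is what such perfect fitting of noise costs at test time: in one dimension the spike needed around the flipped point degrades accuracy in proportion to the noise level (tempered overfitting), while in high dimensions the analogous spikes can be nearly harmless (benign overfitting).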
Pages: 36
Related Papers
50 records
  • [1] Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
    Frei, Spencer
    Vardi, Gal
    Bartlett, Peter L.
    Srebro, Nathan
    THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023
  • [2] Benign, Tempered, or Catastrophic: A Taxonomy of Overfitting
    Mallinar, Neil
    Simon, James B.
    Abedsoltan, Amirhesam
    Pandit, Parthe
    Belkin, Mikhail
    Nakkiran, Preetum
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [3] Benign Overfitting in Two-layer Convolutional Neural Networks
    Cao, Yuan
    Chen, Zixiang
    Belkin, Mikhail
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [4] Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension
    Haas, Moritz
    Holzmueller, David
    von Luxburg, Ulrike
    Steinwart, Ingo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] An exact mapping from ReLU networks to spiking neural networks
    Stanojevic, Ana
    Wozniak, Stanislaw
    Bellec, Guillaume
    Cherubini, Giovanni
    Pantazi, Angeliki
    Gerstner, Wulfram
    NEURAL NETWORKS, 2023, 168 : 74 - 88
  • [6] Biased ReLU neural networks
    Liang, XingLong
    Xu, Jun
    NEUROCOMPUTING, 2021, 423 : 71 - 79
  • [7] A Vibrating Mechanism to Prevent Neural Networks from Overfitting
    Xiong, Jian
    Zhang, Kai
    Zhang, Hao
    2019 15TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE (IWCMC), 2019, : 1737 - 1742
  • [8] On the Error Bounds for ReLU Neural Networks
    Katende, Ronald
    Kasumba, Henry
    Kakuba, Godwin
    Mango, John
    IAENG International Journal of Applied Mathematics, 2024, 54 (12) : 2602 - 2611
  • [9] Advances in verification of ReLU neural networks
    Ansgar Rössig
    Milena Petkovic
    Journal of Global Optimization, 2021, 81 : 109 - 152
  • [10] Verifying ReLU Neural Networks from a Model Checking Perspective
    Liu, Wan-Wei
    Song, Fu
    Zhang, Tang-Hao-Ran
    Wang, Ji
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (06) : 1365 - 1381