From Tempered to Benign Overfitting in ReLU Neural Networks

Cited: 0
Authors
Kornowski, Guy [1 ]
Yehudai, Gilad [1 ]
Shamir, Ohad [1 ]
Affiliations
[1] Weizmann Inst Sci, Rehovot, Israel
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
European Research Council;
Keywords
KERNEL RIDGELESS REGRESSION;
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Overparameterized neural networks (NNs) are observed to generalize well even when trained to perfectly fit noisy data. This phenomenon motivated a large body of work on "benign overfitting", where interpolating predictors achieve near-optimal performance. Recently, it was conjectured and empirically observed that the behavior of NNs is often better described as "tempered overfitting", where the performance is non-optimal yet also non-trivial, and degrades as a function of the noise level. However, a theoretical justification of this claim for non-linear NNs has been lacking so far. In this work, we provide several results that aim at bridging these complementary views. We study a simple classification setting with 2-layer ReLU NNs, and prove that under various assumptions, the type of overfitting transitions from tempered in the extreme case of one-dimensional data, to benign in high dimensions. Thus, we show that the input dimension plays a crucial role in the overfitting profile in this setting, which we also validate empirically for intermediate dimensions. Overall, our results shed light on the intricate connections between the dimension, sample size, architecture and training algorithm on the one hand, and the type of resulting overfitting on the other.
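To make the abstract's central object concrete, the following is a minimal illustrative sketch (not taken from the paper): a 2-layer ReLU network on one-dimensional inputs, constructed explicitly so that it exactly interpolates a training set in which one label has been flipped to play the role of label noise. The data points and the piecewise-linear construction are illustrative assumptions; the paper analyzes the generalization of such interpolating predictors, not this particular construction.

```python
# Illustrative sketch: a width-(n-1) 2-layer ReLU network
#   f(x) = y_0 + sum_i c_i * relu(x - x_i)
# that perfectly fits noisy 1-D training labels. The data below are
# hypothetical; the flipped label stands in for label noise.

def relu(z):
    return max(0.0, z)

def build_interpolant(xs, ys):
    """Return a 2-layer ReLU net passing through every (x_i, y_i).
    Assumes xs is sorted and strictly increasing."""
    # Slope of each linear piece between consecutive points.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]
    # Each hidden ReLU unit contributes the *change* in slope at x_i.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1]
                            for i in range(1, len(slopes))]
    def f(x):
        return ys[0] + sum(c * relu(x - xi) for c, xi in zip(coeffs, xs[:-1]))
    return f

xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [1.0, 1.0, -1.0, 1.0, 1.0, 1.0]   # the -1.0 is a "noisy" label
f = build_interpolant(xs, ys)

# The network interpolates: it fits every training point, noise included.
assert all(abs(f(x) - y) < 1e-9 for x, y in zip(xs, ys))
```

The question the paper studies is what such perfect fitting of noise costs at test time: in one dimension the spike needed around the flipped point degrades accuracy in proportion to the noise level (tempered overfitting), while in high dimensions the analogous spikes can be nearly harmless (benign overfitting).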
Pages: 36
Related Papers
50 records
  • [1] Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
    Frei, Spencer
    Vardi, Gal
    Bartlett, Peter L.
    Srebro, Nathan
    THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023
  • [2] Benign, Tempered, or Catastrophic: A Taxonomy of Overfitting
    Mallinar, Neil
    Simon, James B.
    Abedsoltan, Amirhesam
    Pandit, Parthe
    Belkin, Mikhail
    Nakkiran, Preetum
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [3] Benign Overfitting in Two-layer Convolutional Neural Networks
    Cao, Yuan
    Chen, Zixiang
    Belkin, Mikhail
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [4] Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension
    Haas, Moritz
    Holzmueller, David
    von Luxburg, Ulrike
    Steinwart, Ingo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] An exact mapping from ReLU networks to spiking neural networks
    Stanojevic, Ana
    Wozniak, Stanislaw
    Bellec, Guillaume
    Cherubini, Giovanni
    Pantazi, Angeliki
    Gerstner, Wulfram
    NEURAL NETWORKS, 2023, 168 : 74 - 88
  • [6] Biased ReLU neural networks
    Liang, XingLong
    Xu, Jun
    NEUROCOMPUTING, 2021, 423 : 71 - 79
  • [7] A Vibrating Mechanism to Prevent Neural Networks from Overfitting
    Xiong, Jian
    Zhang, Kai
    Zhang, Hao
    2019 15TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE (IWCMC), 2019, : 1737 - 1742
  • [8] On the Error Bounds for ReLU Neural Networks
    Katende, Ronald
    Kasumba, Henry
    Kakuba, Godwin
    Mango, John
    IAENG International Journal of Applied Mathematics, 2024, 54 (12) : 2602 - 2611
  • [9] Advances in verification of ReLU neural networks
    Ansgar Rössig
    Milena Petkovic
    Journal of Global Optimization, 2021, 81 : 109 - 152
  • [10] Verifying ReLU Neural Networks from a Model Checking Perspective
    Liu, Wan-Wei
    Song, Fu
    Zhang, Tang-Hao-Ran
    Wang, Ji
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (06) : 1365 - 1381