Benign overfitting in linear regression

Cited by: 324
Authors
Bartlett, Peter L. [1 ,2 ]
Long, Philip M. [3 ]
Lugosi, Gabor [4 ,5 ,6 ]
Tsigler, Alexander [1 ]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Comp Sci Div, Berkeley, CA 94720 USA
[3] Google Brain, Mountain View, CA 94043 USA
[4] Pompeu Fabra Univ, Econ & Business, Barcelona 08005, Spain
[5] Institucio Catalana Recerca & Estudis Avancats, Lluis Co 23, Barcelona 08010, Spain
[6] Barcelona Grad Sch Econ, Barcelona 08005, Spain
Keywords
statistical learning theory; overfitting; linear regression; interpolation; neural networks
DOI
10.1073/pnas.1907378117
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization is in terms of two notions of the effective rank of the data covariance. It shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size. By studying examples of data covariance properties that this characterization shows are required for benign overfitting, we find an important role for finite-dimensional data: the accuracy of the minimum norm interpolating prediction rule approaches the best possible accuracy for a much narrower range of properties of the data distribution when the data lie in an infinite-dimensional space vs. when the data lie in a finite-dimensional space with dimension that grows faster than the sample size.
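To make the abstract's objects concrete, the sketch below (not the authors' code) fits the minimum-norm interpolating linear predictor on synthetic overparameterized data and computes the two effective ranks of the covariance used in the paper's characterization, r_k(Σ) = (∑_{i>k} λ_i)/λ_{k+1} and R_k(Σ) = (∑_{i>k} λ_i)² / ∑_{i>k} λ_i². The covariance spectrum, sample size, dimension, and noise level are illustrative assumptions chosen for the demo.

```python
# Minimal sketch (not the authors' code): minimum-norm interpolation and the
# two effective ranks r_k, R_k of the data covariance from Bartlett et al.
# The spectrum, sample size n, dimension p, and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 1000                                    # n samples, p >> n features

# Assumed covariance spectrum (non-increasing): a few strong directions
# plus a slowly decaying tail of low-variance directions.
lam = np.concatenate([np.full(10, 1.0), 1.0 / np.arange(11, p + 1)])
X = rng.standard_normal((n, p)) * np.sqrt(lam)     # rows ~ N(0, diag(lam))
theta_true = np.zeros(p)
theta_true[0] = 1.0                                # assumed true parameter
y = X @ theta_true + 0.1 * rng.standard_normal(n)  # noisy labels

# Minimum l2-norm solution among all interpolators: theta_hat = X^+ y.
theta_hat = np.linalg.pinv(X) @ y
print("max train residual:", np.abs(X @ theta_hat - y).max())  # ~0: perfect fit

def effective_ranks(lam, k):
    """r_k = (sum_{i>k} lam_i) / lam_{k+1};  R_k = (sum_{i>k} lam_i)^2 / sum_{i>k} lam_i^2."""
    tail = lam[k:]                                 # eigenvalues beyond index k
    return tail.sum() / tail[0], tail.sum() ** 2 / (tail ** 2).sum()

r_k, R_k = effective_ranks(lam, k=10)
print(f"n = {n}, r_10 = {r_k:.1f}, R_10 = {R_k:.1f}")
```

Loosely, the characterization compares these ranks with n: the tail of the spectrum must contain many low-variance directions (effective rank large relative to the sample size) for the interpolator's risk to approach the optimum, matching the abstract's statement that the directions unimportant for prediction must significantly outnumber the samples.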
Pages: 30063-30070
Number of pages: 8
Related papers
50 records in total
  • [1] Benign Overfitting of Constant-Stepsize SGD for Linear Regression
    Zou, Difan; Wu, Jingfeng; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham M.
    Journal of Machine Learning Research, 2023, 24
  • [2] Benign overfitting in ridge regression
    Tsigler, Alexander; Bartlett, Peter L.
    Journal of Machine Learning Research, 2023, 24
  • [3] Benign overfitting and adaptive nonparametric regression
    Chhor, Julien; Sigalla, Suzanne; Tsybakov, Alexandre B.
    Probability Theory and Related Fields, 2024, 189 (3-4): 949-980
  • [4] Benign Overfitting in Adversarially Robust Linear Classification
    Chen, Jinghui; Cao, Yuan; Gu, Quanquan
    Uncertainty in Artificial Intelligence, 2023, 216: 313-323
  • [5] Benign overfitting of non-sparse high-dimensional linear regression with correlated noise
    Tsuda, Toshiki; Imaizumi, Masaaki
    Electronic Journal of Statistics, 2024, 18 (2): 4119-4197
  • [6] Replica analysis of overfitting in generalized linear regression models
    Coolen, A. C. C.; Sheikh, M.; Mozeika, A.; Aguirre-Lopez, F.; Antenucci, F.
    Journal of Physics A: Mathematical and Theoretical, 2020, 53 (36)
  • [7] Benign Overfitting and Noisy Features
    Li, Zhu; Su, Weijie J.; Sejdinovic, Dino
    Journal of the American Statistical Association, 2023, 118 (544): 2876-2888
  • [8] The Implicit Bias of Benign Overfitting
    Shamir, Ohad
    Conference on Learning Theory, 2022, 178: 448-478
  • [9] The Implicit Bias of Benign Overfitting
    Shamir, Ohad
    Journal of Machine Learning Research, 2023, 24
  • [10] The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks
    Chatterji, Niladri S.; Long, Philip M.; Bartlett, Peter L.
    Journal of Machine Learning Research, 2022, 23