Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks

Cited: 12
Authors
Parhi, Rahul [1 ,2 ]
Nowak, Robert D. [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Elect & Comp Engn, Madison, WI 53706 USA
[2] Ecole Polytech Fed Lausanne, Biomed Imaging Grp, CH-1015 Lausanne, Switzerland
Keywords
Estimation; Training; Biological neural networks; TV; Radon; Noise measurement; Neurons; Neural networks; ridge functions; sparsity; function approximation; nonparametric function estimation; NONPARAMETRIC REGRESSION; ASYMPTOTIC EQUIVALENCE; CONVERGENCE-RATES; APPROXIMATION; BOUNDS; MULTIVARIATE; SPLINES;
DOI
10.1109/TIT.2022.3208653
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
We study the problem of estimating an unknown function from noisy data using shallow ReLU neural networks. The estimators we study minimize the sum of squared data-fitting errors plus a regularization term proportional to the squared Euclidean norm of the network weights. This minimization corresponds to the common approach of training a neural network with weight decay. We quantify the performance (mean-squared error) of these neural network estimators when the data-generating function belongs to the second-order Radon-domain bounded variation space. This space of functions was recently proposed as the natural function space associated with shallow ReLU neural networks. We derive a minimax lower bound for the estimation problem for this function space and show that the neural network estimators are minimax optimal up to logarithmic factors. This minimax rate is immune to the curse of dimensionality. We quantify an explicit gap between neural networks and linear methods (which include kernel methods) by deriving a linear minimax lower bound for the estimation problem, showing that linear methods necessarily suffer the curse of dimensionality in this function space. As a result, this paper sheds light on the phenomenon that neural networks seem to break the curse of dimensionality.
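The estimator analyzed in the abstract is the familiar weight-decay objective: squared data-fitting error plus a penalty proportional to the squared Euclidean norm of the network weights. The sketch below is our illustration, not the authors' code; the hidden width, data model, learning rate, and regularization strength lam are assumed values, and biases are left unpenalized as a simplification.

import torch
import torch.nn as nn

torch.manual_seed(0)

d, n, width = 5, 200, 64                  # input dimension, sample size, hidden width (illustrative)
X = torch.randn(n, d)                     # design points
y = torch.sin(X.sum(dim=1, keepdim=True)) + 0.1 * torch.randn(n, 1)   # noisy responses

model = nn.Sequential(                    # shallow (single-hidden-layer) ReLU network
    nn.Linear(d, width),
    nn.ReLU(),
    nn.Linear(width, 1),
)

lam = 1e-3                                # regularization strength (assumed, not taken from the paper)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    fit = ((model(X) - y) ** 2).sum()     # squared data-fitting error
    # weight-decay penalty: squared Euclidean norm of the layer weight matrices
    reg = sum((m.weight ** 2).sum() for m in model if isinstance(m, nn.Linear))
    loss = fit + lam * reg
    loss.backward()
    opt.step()

Equivalently, the same penalty can be applied implicitly by passing a weight_decay argument to the PyTorch optimizer instead of adding the term to the loss by hand.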
Pages: 1125-1140
Page count: 16