Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks

Cited: 12
Authors
Parhi, Rahul [1 ,2 ]
Nowak, Robert D. [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Elect & Comp Engn, Madison, WI 53706 USA
[2] Ecole Polytech Fed Lausanne, Biomed Imaging Grp, CH-1015 Lausanne, Switzerland
Keywords
Estimation; Training; Biological neural networks; TV; Radon; Noise measurement; Neurons; Neural networks; ridge functions; sparsity; function approximation; nonparametric function estimation; NONPARAMETRIC REGRESSION; ASYMPTOTIC EQUIVALENCE; CONVERGENCE-RATES; APPROXIMATION; BOUNDS; MULTIVARIATE; SPLINES;
DOI
10.1109/TIT.2022.3208653
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
We study the problem of estimating an unknown function from noisy data using shallow ReLU neural networks. The estimators we study minimize the sum of squared data-fitting errors plus a regularization term proportional to the squared Euclidean norm of the network weights. This minimization corresponds to the common approach of training a neural network with weight decay. We quantify the performance (mean-squared error) of these neural network estimators when the data-generating function belongs to the second-order Radon-domain bounded variation space. This space of functions was recently proposed as the natural function space associated with shallow ReLU neural networks. We derive a minimax lower bound for the estimation problem for this function space and show that the neural network estimators are minimax optimal up to logarithmic factors. This minimax rate is immune to the curse of dimensionality. We quantify an explicit gap between neural networks and linear methods (which include kernel methods) by deriving a linear minimax lower bound for the estimation problem, showing that linear methods necessarily suffer the curse of dimensionality in this function space. As a result, this paper sheds light on the phenomenon that neural networks seem to break the curse of dimensionality.
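The estimator analyzed in the abstract is the familiar weight-decay objective: squared data-fitting error plus a penalty proportional to the squared Euclidean norm of the network weights. The sketch below is our illustration, not the authors' code; the hidden width, data model, learning rate, and regularization strength lam are assumed values, and biases are left unpenalized as a simplification.

import torch
import torch.nn as nn

torch.manual_seed(0)

d, n, width = 5, 200, 64                  # input dimension, sample size, hidden width (illustrative)
X = torch.randn(n, d)                     # design points
y = torch.sin(X.sum(dim=1, keepdim=True)) + 0.1 * torch.randn(n, 1)   # noisy responses

model = nn.Sequential(                    # shallow (single-hidden-layer) ReLU network
    nn.Linear(d, width),
    nn.ReLU(),
    nn.Linear(width, 1),
)

lam = 1e-3                                # regularization strength (assumed, not taken from the paper)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    fit = ((model(X) - y) ** 2).sum()     # squared data-fitting error
    # weight-decay penalty: squared Euclidean norm of the layer weight matrices
    reg = sum((m.weight ** 2).sum() for m in model if isinstance(m, nn.Linear))
    loss = fit + lam * reg
    loss.backward()
    opt.step()

Equivalently, the same penalty can be applied implicitly by passing a weight_decay argument to the PyTorch optimizer instead of adding the term to the loss by hand.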
Pages: 1125-1140
Page count: 16