On size-independent sample complexity of ReLU networks

Cited: 0
Authors
Sellke, Mark [1]
Affiliations
[1] Harvard University, Department of Statistics, Cambridge, MA, USA
Keywords
Neural networks; Rademacher complexity; Generalization; Theory of computation
DOI
10.1016/j.ipl.2024.106482
Chinese Library Classification (CLC) Number
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
We study the sample complexity of learning ReLU neural networks from the point of view of generalization. Given norm constraints on the weight matrices, a common approach is to estimate the Rademacher complexity of the associated function class. Previous work [9] obtained a bound independent of the network size (scaling with a product of Frobenius norms), except for a factor of the square root of the depth. We give a refinement which often has no explicit depth-dependence at all.
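For context, the quantity being controlled is the empirical Rademacher complexity of the norm-constrained network class. The following is a schematic sketch, with constants and exact norm conventions omitted, of the definition and of the prior depth-dependent bound the abstract refers to; it is an illustration consistent with the abstract, not a statement of the paper's precise result.

\[
\widehat{\mathfrak{R}}_n(\mathcal{F}) \;=\; \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \right],
\qquad \sigma_1,\dots,\sigma_n \ \text{i.i.d. uniform on } \{\pm 1\}.
\]

For depth-\(L\) ReLU networks \(f(x) = W_L \,\mathrm{ReLU}(W_{L-1} \cdots \mathrm{ReLU}(W_1 x))\) on inputs satisfying \(\|x_i\| \le B\), the earlier bound of [9] has the schematic form

\[
\widehat{\mathfrak{R}}_n(\mathcal{F}) \;\lesssim\; \frac{B \,\sqrt{L}\, \prod_{j=1}^{L} \|W_j\|_F}{\sqrt{n}},
\]

and the refinement announced in the abstract removes the explicit \(\sqrt{L}\) factor in many settings.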
Pages: 3
Related Papers
(50 records in total)
  • [31] On the uniform approximation estimation of deep ReLU networks via frequency decomposition
    Chen, Liang
    Liu, Wenjun
    AIMS MATHEMATICS, 2022, 7 (10) : 19018 - 19025
  • [32] Gradient descent optimizes over-parameterized deep ReLU networks
    Zou, Difan
    Cao, Yuan
    Zhou, Dongruo
    Gu, Quanquan
    MACHINE LEARNING, 2020, 109 (03) : 467 - 492
  • [33] New Error Bounds for Deep ReLU Networks Using Sparse Grids
    Montanelli, Hadrien
    Du, Qiang
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (01) : 78 - 92
  • [34] Deep ReLU networks and high-order finite element methods
    Opschoor, Joost A. A.
    Petersen, Philipp C.
    Schwab, Christoph
    ANALYSIS AND APPLICATIONS, 2020, 18 (05) : 715 - 770
  • [35] Approximate spectral decomposition of Fisher information matrix for simple ReLU networks
    Takeishi, Yoshinari
    Iida, Masazumi
    Takeuchi, Jun'ichi
    NEURAL NETWORKS, 2023, 164 : 691 - 706
  • [36] Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
    Holzmueller, David
    Steinwart, Ingo
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [37] Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
    Chen, Minshuo
    Jiang, Haoming
    Liao, Wenjing
    Zhao, Tuo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [38] ReLU networks as surrogate models in mixed-integer linear programs
    Grimstad, Bjarne
    Andersson, Henrik
    COMPUTERS & CHEMICAL ENGINEERING, 2019, 131
  • [39] Plateau Phenomenon in Gradient Descent Training of ReLU Networks: Explanation, Quantification, and Avoidance
    Ainsworth, Mark
    Shin, Yeonjong
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2021, 43 (05) : A3438 - A3468
  • [40] Gradient Descent Provably Escapes Saddle Points in the Training of Shallow ReLU Networks
    Cheridito, Patrick
    Jentzen, Arnulf
    Rossmannek, Florian
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2024, 203 (03) : 2617 - 2648