The Sample Complexity of One-Hidden-Layer Neural Networks

Cited: 0
Authors
Vardi, Gal [1,2,3]
Shamir, Ohad [3]
Srebro, Nathan [1]
Affiliations
[1] TTI Chicago, Chicago, IL 60637 USA
[2] Hebrew Univ Jerusalem, Jerusalem, Israel
[3] Weizmann Inst Sci, Rehovot, Israel
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
Funding
European Research Council
Keywords
BOUNDS
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We study norm-based uniform convergence bounds for neural networks, aiming at a tight understanding of how these are affected by the architecture and type of norm constraint, for the simple class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm. We begin by proving that in general, controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees (independent of the network width), while a stronger Frobenius norm control is sufficient, extending and improving on previous work. Motivated by the proof constructions, we identify and analyze two important settings where (perhaps surprisingly) a mere spectral norm control turns out to be sufficient: First, when the network's activation functions are sufficiently smooth (with the result extending to deeper networks); and second, for certain types of convolutional networks. In the latter setting, we study how the sample complexity is additionally affected by parameters such as the amount of overlap between patches and the overall number of patches.
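To make the abstract's setting concrete, the following is a minimal sketch in our own notation (the symbols b, B, and n are illustrative and not taken from the paper): the class under study consists of scalar-valued one-hidden-layer networks on inputs of bounded Euclidean norm,

\[
\mathcal{H}_b \;=\; \bigl\{\, x \mapsto v^{\top}\sigma(Wx) \;:\; \lVert v\rVert_2 \le 1,\ \lVert W\rVert \le b \,\bigr\},
\qquad \lVert x\rVert_2 \le B,
\]

where \(\sigma\) is the activation applied entrywise, \(W\) is the hidden-layer weight matrix, and \(\lVert W\rVert\) denotes either the spectral norm \(\lVert W\rVert_2\) or the Frobenius norm \(\lVert W\rVert_F\). Under Frobenius-norm control and a 1-Lipschitz activation with \(\sigma(0)=0\), classical Rademacher-complexity arguments (cf. related paper [36] below) bound the uniform-convergence rate by roughly \(O(bB/\sqrt{n})\) on \(n\) samples, independently of the network width; the abstract's first result says that replacing \(\lVert W\rVert_F\) by the weaker spectral norm does not, in general, admit any such width-independent guarantee, outside the smooth-activation and convolutional settings the paper identifies.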
Pages: 12
Related Papers
50 records in total
  • [31] On the Sample Complexity for Nonoverlapping Neural Networks
    Schmitt, Michael
    Machine Learning, 1999, 37: 131-141
  • [32] A Low-complexity Visual Tracking Approach with Single Hidden Layer Neural Networks
    Dai, Liang
    Zhu, Yuesheng
    Luo, Guibo
    He, Chao
    2014 13TH INTERNATIONAL CONFERENCE ON CONTROL AUTOMATION ROBOTICS & VISION (ICARCV), 2014: 810-814
  • [33] Simultaneous approximations of multivariate functions and their derivatives by neural networks with one hidden layer
    Li, X.
    NEUROCOMPUTING, 1996, 12(4): 327-343
  • [34] Neural Networks with Marginalized Corrupted Hidden Layer
    Li, Yanjun
    Xin, Xin
    Guo, Ping
    NEURAL INFORMATION PROCESSING, PT III, 2015, 9491: 506-514
  • [35] Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
    Du, Simon S.
    Lee, Jason D.
    Tian, Yuandong
    Poczos, Barnabas
    Singh, Aarti
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018
  • [36] Size-independent sample complexity of neural networks
    Golowich, Noah
    Rakhlin, Alexander
    Shamir, Ohad
    INFORMATION AND INFERENCE: A JOURNAL OF THE IMA, 2020, 9(2): 473-504
  • [37] Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks
    Kalan, Seyed Mohammadreza Mousavi
    Fabian, Zalan
    Avestimehr, Salman
    Soltanolkotabi, Mahdi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33 (NEURIPS 2020), 2020
  • [38] Neural networks for word recognition: Is a hidden layer necessary?
    Dandurand, Frederic
    Hannagan, Thomas
    Grainger, Jonathan
    COGNITION IN FLUX, 2010: 688-693
  • [39] Regularization of hidden layer unit response for neural networks
    Taga, K.
    Kameyama, K.
    Toraichi, K.
    2003 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS, AND SIGNAL PROCESSING, VOLS 1 AND 2, CONFERENCE PROCEEDINGS, 2003: 348-351
  • [40] HOW TO DETERMINE THE STRUCTURE OF THE HIDDEN LAYER IN NEURAL NETWORKS
    魏强
    张士军
    张勇传
    水电能源科学, 1997(01): 18-22