Sampling weights of deep neural networks

Cited by: 0
Authors
Bolager, Erik Lien [1 ]
Burak, Iryna [1 ]
Datar, Chinmay [1 ,2 ]
Sun, Qing [1 ]
Dietrich, Felix [1 ]
Affiliations
[1] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
[2] Tech Univ Munich, Inst Adv Study, Munich, Germany
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
MACHINE;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed to obtain a trained network. The sampling is based on the idea of random feature models. However, instead of a data-agnostic distribution, e.g., a normal distribution, we use both the input and the output training data to sample shallow and deep networks. We prove that sampled networks are universal approximators. For Barron functions, we show that the L²-approximation error of sampled shallow networks decreases with the square root of the number of neurons. Our sampling scheme is invariant to rigid body transformations and scaling of the input data, which implies many popular pre-processing techniques are not required. In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures.
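To make the abstract's recipe concrete, below is a minimal sketch of the general idea: hidden weights and biases are constructed from sampled pairs of training inputs rather than drawn from a data-agnostic distribution, and only the linear output layer is fit, via a single least-squares solve. The specific pair-based formula, the `scale` constant, and the helper `sample_shallow_network` are illustrative assumptions for this sketch, not the paper's exact sampling distribution.

```python
# Sketch: data-driven weight sampling for a shallow tanh network, followed by
# one least-squares solve for the output layer (no gradient descent).
# The pair-based construction and `scale` are assumptions, not the paper's
# exact distribution.
import numpy as np

rng = np.random.default_rng(0)

def sample_shallow_network(X, y, n_neurons, scale=1.0):
    """Build a one-hidden-layer tanh network from sampled data pairs."""
    n, d = X.shape
    W = np.empty((n_neurons, d))
    b = np.empty(n_neurons)
    for k in range(n_neurons):
        i, j = rng.choice(n, size=2, replace=False)  # sample a pair of inputs
        diff = X[j] - X[i]
        norm2 = np.dot(diff, diff) + 1e-12           # guard against duplicate points
        W[k] = scale * diff / norm2                  # weight points from x_i toward x_j
        b[k] = -np.dot(W[k], (X[i] + X[j]) / 2)      # center the neuron between the pair
    H = np.tanh(X @ W.T + b)                         # hidden features, shape (n, n_neurons)
    # Only the linear output layer is "trained", via ordinary least squares.
    c, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda X_new: np.tanh(X_new @ W.T + b) @ c

# Usage: regress a 1-D function from noisy samples.
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(200)
f = sample_shallow_network(X, y, n_neurons=100, scale=4.0)
print(np.mean((f(X) - y) ** 2))  # training MSE
```

Because each neuron is placed between two observed data points, the features adapt to where the data lives, which is the intuition behind sampling with the training data instead of a normal distribution.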
Pages: 42