On minimal representations of shallow ReLU networks

Cited by: 6
Authors
Dereich, Steffen [1]
Kassing, Sebastian [1]
Affiliation
[1] Westfälische Wilhelms-Universität Münster, Institut für Mathematische Stochastik, Fachbereich 10 Mathematik und Informatik, Orléans-Ring 10, D-48149 Münster, Germany
Keywords
Neural networks; Shallow networks; Minimal representations; ReLU activation; Multilayer feedforward networks
DOI
10.1016/j.neunet.2022.01.006
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
The realization function of a shallow ReLU network is a continuous, piecewise affine function f : R^d -> R, where the domain R^d is partitioned by a set of n hyperplanes into cells on which f is affine. We show that the minimal representation for f uses either n, n + 1 or n + 2 neurons, and we characterize each of the three cases. In the particular case where the input layer is one-dimensional, minimal representations always use at most n + 1 neurons, but in all higher-dimensional settings there are functions for which n + 2 neurons are needed. We then show that the set of minimal networks representing f forms a C^∞-submanifold M, and we derive the dimension and the number of connected components of M. Additionally, we give a criterion for the hyperplanes that guarantees that a continuous, piecewise affine function is the realization function of an appropriate shallow ReLU network. (c) 2022 Elsevier Ltd. All rights reserved.
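For orientation, the realization function described in the abstract admits the standard shallow-network parameterization sketched below (a sketch only; the weights w_i, biases b_i, output weights v_i and offset c are generic symbols introduced here, not notation taken from the paper):

\[
  f(x) \;=\; c + \sum_{i=1}^{n} v_i \,\max\bigl(\langle w_i, x \rangle + b_i,\; 0\bigr), \qquad x \in \mathbb{R}^d,
\]
\[
  H_i \;=\; \bigl\{\, x \in \mathbb{R}^d : \langle w_i, x \rangle + b_i = 0 \,\bigr\}, \qquad i = 1, \dots, n.
\]

On each cell of the arrangement induced by H_1, ..., H_n every max term is either identically zero or affine, so f is affine there; the abstract's results concern how few such neurons suffice to reproduce a given f and what the set of all minimal parameterizations looks like.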
Pages: 121-128 (8 pages)
Related papers (showing items 41-50 of 50)
  • [41] Chen, Minshuo; Jiang, Haoming; Liao, Wenjing; Zhao, Tuo. Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019, 32.
  • [42] Grimstad, Bjarne; Andersson, Henrik. ReLU networks as surrogate models in mixed-integer linear programs. Computers & Chemical Engineering, 2019, 131.
  • [43] Schmidt-Hieber, Johannes. Nonparametric regression using deep neural networks with ReLU activation function. Annals of Statistics, 2020, 48(4): 1875-1897.
  • [44] Takeishi, Yoshinari; Iida, Masazumi; Takeuchi, Jun'ichi. Approximate spectral decomposition of Fisher information matrix for simple ReLU networks. Neural Networks, 2023, 164: 691-706.
  • [45] Holzmueller, David; Steinwart, Ingo. Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent. Journal of Machine Learning Research, 2022, 23.
  • [46] Opschoor, Joost A. A.; Petersen, Philipp C.; Schwab, Christoph. Deep ReLU networks and high-order finite element methods. Analysis and Applications, 2020, 18(5): 715-770.
  • [47] Ainsworth, Mark; Shin, Yeonjong. Plateau phenomenon in gradient descent training of ReLU networks: explanation, quantification, and avoidance. SIAM Journal on Scientific Computing, 2021, 43(5): A3438-A3468.
  • [48] Petersen, Philipp; Voigtlaender, Felix. Optimal approximation of piecewise smooth functions using deep ReLU neural networks. Neural Networks, 2018, 108: 296-330.
  • [49] Kurkova, Vera; Sanguineti, Marcello. Probabilistic lower bounds for approximation by shallow perceptron networks. Neural Networks, 2017, 91: 34-41.
  • [50] Zhou, Tian-Yi; Huo, Xiaoming. Classification of Data Generated by Gaussian Mixture Models Using Deep ReLU Networks. Journal of Machine Learning Research, 2024, 25: 1-54.