Banach Space Representer Theorems for Neural Networks and Ridge Splines

Times Cited: 0
Authors
Parhi, Rahul [1 ]
Nowak, Robert D. [1 ]
Affiliations
[1] Univ Wisconsin, Dept Elect & Comp Engn, 1415 Johnson Dr, Madison, WI 53706 USA
Funding
U.S. National Science Foundation;
Keywords
neural networks; splines; inverse problems; regularization; sparsity; MULTILAYER FEEDFORWARD NETWORKS; APPROXIMATION; REGULARIZATION;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
We develop a variational framework to understand the properties of the functions learned by neural networks fit to data. We propose and study a family of continuous-domain linear inverse problems with total-variation-like regularization in the Radon domain subject to data-fitting constraints. We derive a representer theorem showing that finite-width, single-hidden-layer neural networks are solutions to these inverse problems. We draw on many techniques from variational spline theory, and so we propose the notion of polynomial ridge splines, which correspond to single-hidden-layer neural networks with truncated power functions as the activation function. The representer theorem is reminiscent of the classical reproducing kernel Hilbert space representer theorem, but we show that the neural network problem is posed over a non-Hilbertian Banach space. While the learning problems are posed in the continuous domain, similar to kernel methods, the problems can be recast as finite-dimensional neural network training problems. These neural network training problems have regularizers that are related to the well-known weight decay and path-norm regularizers. Thus, our result gives insight into the functional characteristics of trained neural networks and into the design of neural network regularizers. We also show that these regularizers promote neural network solutions with desirable generalization properties.
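The objects described in the abstract can be sketched schematically. The following display is a hedged reconstruction from the abstract alone; the precise operator (a total-variation-type seminorm in the Radon domain), function space, and constraint form are specified in the paper, and the symbols here are assumptions chosen for illustration:

\min_{f} \ \|\mathcal{R} f\|_{\mathrm{TV}} \quad \text{subject to} \quad f(x_i) = y_i, \quad i = 1, \dots, N,

with single-hidden-layer solutions of the form

f^\star(x) = \sum_{k=1}^{K} v_k \, \rho\!\left(w_k^{\mathsf{T}} x - b_k\right) + c(x),

where \rho is a truncated power activation (the ReLU being the degree-one truncated power) and c is a low-degree polynomial term.

Below is a minimal PyTorch sketch of the corresponding finite-dimensional training problem, using a coupled penalty of the form \sum_k |v_k| \, \|w_k\|_2 as one reading of the regularizers the abstract says are "related to the well-known weight decay and path-norm regularizers." The toy data, network width, hyperparameters, and omission of the polynomial term are illustrative assumptions, not the paper's experiments:

import torch

torch.manual_seed(0)

# Toy 1-D regression data (assumed for illustration).
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)      # inputs, shape (64, 1)
y = torch.sin(3.0 * x) + 0.1 * torch.randn_like(x)  # noisy targets

K = 32                                       # hidden width (assumed)
W = torch.randn(K, 1, requires_grad=True)    # input weights w_k
b = torch.randn(K, requires_grad=True)       # biases b_k
v = torch.randn(K, requires_grad=True)       # output weights v_k
lam = 1e-3                                   # regularization strength (assumed)

opt = torch.optim.Adam([W, b, v], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    hidden = torch.relu(x @ W.T + b)         # (64, K): ReLU ridge functions
    pred = hidden @ v.unsqueeze(1)           # (64, 1): weighted sum of neurons
    data_fit = torch.mean((pred - y) ** 2)   # squared-error data-fitting term
    # Coupled penalty: |v_k| times the Euclidean norm of the k-th input weight.
    penalty = torch.sum(v.abs() * W.norm(dim=1))
    loss = data_fit + lam * penalty
    loss.backward()
    opt.step()

Because the ReLU is positively homogeneous, rescaling (v_k, w_k, b_k) to (v_k / \alpha, \alpha w_k, \alpha b_k) leaves the network unchanged, and minimizing the product penalty over such rescalings coincides with minimizing the standard weight-decay penalty \tfrac{1}{2} \sum_k (v_k^2 + \|w_k\|_2^2); this is one way to see the connection the abstract draws between the two regularizers.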
Pages: 40