What Kinds of Functions Do Deep Neural Networks Learn? Insights from Variational Spline Theory

Cited by: 20
Authors
Parhi, Rahul [1 ]
Nowak, Robert D. [1 ]
Affiliations
[1] Univ Wisconsin Madison, Dept Elect & Comp Engn, Madison, WI 53706 USA
Source
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE | 2022, Vol. 4, No. 2
Keywords
neural networks; deep learning; splines; regularization; sparsity; representer theorem; INVERSE PROBLEMS;
DOI
10.1137/21M1418642
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Classification Code
070104
Abstract
We develop a variational framework to understand the properties of functions learned by fitting deep neural networks with rectified linear unit (ReLU) activations to data. We propose a new function space, which is related to classical bounded variation-type spaces, that captures the compositional structure associated with deep neural networks. We derive a representer theorem showing that deep ReLU networks are solutions to regularized data-fitting problems over functions from this space. The function space consists of compositions of functions from the Banach space of second-order bounded variation in the Radon domain. This Banach space has a sparsity-promoting norm, giving insight into the role of sparsity in deep neural networks. The neural network solutions have skip connections and rank-bounded weight matrices, providing new theoretical support for these common architectural choices. The variational problem we study can be recast as a finite-dimensional neural network training problem with regularization schemes related to the notions of weight decay and path-norm regularization. Finally, our analysis builds on techniques from variational spline theory, providing new connections between deep neural networks and splines.
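The abstract notes that the variational problem can be recast as network training with regularization related to weight decay and path-norm regularization. A minimal sketch of the connection, assuming a single ReLU neuron (the function and variable names are illustrative, not from the paper): because ReLU is positively homogeneous, the weights feeding into and out of a neuron can be rescaled without changing the network's function, and minimizing the weight-decay penalty over such rescalings yields the path-norm product of the two weight magnitudes.

```python
import numpy as np

# Sketch (not the paper's code): one ReLU neuron x -> v * relu(w @ x).
# Positive homogeneity means (w, v) -> (a*w, v/a), a > 0, leaves the
# function unchanged, so weight decay can be minimized over rescalings.

def weight_decay(w, v):
    # Standard weight-decay penalty on the neuron's parameters.
    return 0.5 * (np.dot(w, w) + v**2)

def path_norm(w, v):
    # Path-norm-style coupling: |output weight| * ||input weights||.
    return abs(v) * np.linalg.norm(w)

def balanced_rescaling(w, v):
    # By AM-GM, a = sqrt(|v| / ||w||) minimizes the weight-decay
    # penalty over rescalings, driving it down to the path norm.
    a = np.sqrt(abs(v) / np.linalg.norm(w))
    return a * w, v / a

rng = np.random.default_rng(0)
w, v = rng.standard_normal(5), 3.0
w_b, v_b = balanced_rescaling(w, v)

x = rng.standard_normal(5)
out = v * max(w @ x, 0.0)
out_b = v_b * max(w_b @ x, 0.0)

# The rescaled neuron computes the same function...
assert np.isclose(out, out_b)
# ...but its weight-decay penalty equals the path norm of the original.
assert np.isclose(weight_decay(w_b, v_b), path_norm(w, v))
```

This rescaling argument is the standard route by which weight decay on homogeneous networks behaves like a sparsity-promoting path-norm penalty, the phenomenon the abstract's representer theorem formalizes.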
Pages: 464-489
Page count: 26
Related Papers
11 records in total
  • [1] Learning Activation Functions in Deep (Spline) Neural Networks
    Bohra, Pakshal
    Campos, Joaquim
    Gupta, Harshit
    Aziznejad, Shayan
    Unser, Michael
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2020, 1 : 295 - 309
  • [2] Approximation of Lipschitz Functions Using Deep Spline Neural Networks
    Neumayer, Sebastian
    Goujon, Alexis
    Bohra, Pakshal
    Unser, Michael
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2023, 5 (02): : 306 - 322
  • [3] Modeling memory: What do we learn from attractor neural networks?
    Brunel, N
    Nadal, JP
    COMPTES RENDUS DE L ACADEMIE DES SCIENCES SERIE III-SCIENCES DE LA VIE-LIFE SCIENCES, 1998, 321 (2-3): : 249 - 252
  • [4] Theory of deep convolutional neural networks III: Approximating radial functions
    Mao, Tong
    Shi, Zhongjie
    Zhou, Ding-Xuan
    NEURAL NETWORKS, 2021, 144 : 778 - 790
  • [5] Deep neural networks for automatic speaker recognition do not learn supra-segmental temporal features
    Neururer, Daniel
    Dellwo, Volker
    Stadelmann, Thilo
    PATTERN RECOGNITION LETTERS, 2024, 181 : 64 - 69
  • [6] Predicting mortality after coronary artery bypass surgery: What do artificial neural networks learn?
    Tu, JV
    Weinstein, MC
    McNeil, BJ
    Naylor, CD
    MEDICAL DECISION MAKING, 1998, 18 (02) : 229 - 235
  • [7] Neural networks to learn protein sequence-function relationships from deep mutational scanning data
    Gelman, Sam
    Fahlberg, Sarah A.
    Heinzelman, Pete
    Romero, Philip A.
    Gitter, Anthony
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2021, 118 (48)
  • [8] Tell Me, What Do You See?-Interpretable Classification of Wiring Harness Branches with Deep Neural Networks
    Kicki, Piotr
    Bednarek, Michal
    Lembicz, Pawel
    Mierzwiak, Grzegorz
    Szymko, Amadeusz
    Kraft, Marek
    Walas, Krzysztof
    SENSORS, 2021, 21 (13)
  • [9] Visual pathways from the perspective of cost functions and multi-task deep neural networks
    Scholte, H. Steven
    Losch, Max M.
    Ramakrishnan, Kandan
    de Haan, Edward H. F.
    Bohte, Sander M.
    CORTEX, 2018, 98 : 249 - 261
  • [10] The Coherence of the Working Memory Study Between Deep Neural Networks and Neurophysiology: Insights From Distinguishing Topographical Electroencephalogram Data Under Different Workloads
    Ming, Yurui
    Lin, Chin-Teng
    IEEE SYSTEMS MAN AND CYBERNETICS MAGAZINE, 2021, 7 (04): : 24 - 30