Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity bounds for Neural Network Approximation Spaces

Cited by: 12

Authors:

Grohs, Philipp [1,2,3]

Voigtlaender, Felix [1,4,5]

Affiliations:

[1] Univ Vienna, Fac Math, Oskar Morgenstern Pl 1, A-1090 Vienna, Austria

[2] Res Platform Data Sci Uni Vienna, Wahringer Str 29-S6, A-1090 Vienna, Austria

[3] Johann Radon Inst, Altenberger Str 69, A-4040 Linz, Austria

[4] Tech Univ Munich, Dept Math, Boltzmannstr 3, D-85748 Garching, Germany

[5] Catholic Univ Eichstatt Ingolstadt KU, Math Inst Machine Learning & Data Sci MIDS, Auf Der Schanz 49, D-85049 Ingolstadt, Germany

Source:

FOUNDATIONS OF COMPUTATIONAL MATHEMATICS | 2024 / Vol. 24 / Issue 04

Keywords:

Deep neural networks; Approximation spaces; Information based complexity; Gelfand numbers; Theory-to-computational gaps; Randomized approximation; GAME; GO;

DOI:

10.1007/s10208-023-09607-w

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

We study the computational complexity of (deterministic or randomized) algorithms based on point samples for approximating or integrating functions that can be well approximated by neural networks. Such algorithms (most prominently stochastic gradient descent and its variants) are used extensively in the field of deep learning. One of the most important problems in this field concerns the question of whether it is possible to realize theoretically provable neural network approximation rates by such algorithms. We answer this question in the negative by proving hardness results for the problems of approximation and integration on a novel class of neural network approximation spaces. In particular, our results confirm a conjectured and empirically observed theory-to-practice gap in deep learning. We complement our hardness results by showing that error bounds of a comparable order of convergence are (at least theoretically) achievable.

Pages: 1085-1143

Number of pages: 59

References (53 in total; entries 1-10 listed):

[1] Adcock, Ben; Dexter, Nick. The Gap between Theory and Practice in Function Approximation with Deep Neural Networks [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2021, 3 (02): 624-655

[2] Aliprantis C.D., 2006, Infinite Dimensional Analysis: A Hitchhiker's Guide

[3] [Anonymous], 1963, Theory of Approximation of Functions of a Real Variable

[4] Arridge, Simon; Maass, Peter; Oktem, Ozan; Schonlieb, Carola-Bibiane. Solving inverse problems using data-driven models [J]. ACTA NUMERICA, 2019, 28: 1-174

[5] Baldi, P.; Sadowski, P.; Whiteson, D. Searching for exotic particles in high-energy physics with deep learning [J]. NATURE COMMUNICATIONS, 2014, 5

[6] Bartlett PL, 2019, J MACH LEARN RES, V20, P1

[7] Beneventano P., 2020, ARXIV

[8] Berner, Julius; Grohs, Philipp; Jentzen, Arnulf. Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black-Scholes Partial Differential Equations [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2020, 2 (03): 631-657

[9] Blum Avrim, 1989, Advances in Neural Information Processing Systems, P494

[10] Boelcskei, Helmut; Grohs, Philipp; Kutyniok, Gitta; Petersen, Philipp. Optimal Approximation with Sparsely Connected Deep Neural Networks [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (01): 8-45
