Approximation Speed of Quantized Versus Unquantized ReLU Neural Networks and Beyond

Cited: 0
Authors
Gonon, Antoine [1 ]
Brisebarre, Nicolas [1 ]
Gribonval, Remi [1 ]
Riccietti, Elisa [1 ]
Affiliations
[1] Univ Lyon, EnsL, UCBL, CNRS,Inria,LIP, F-69342 Lyon, France
Funding
National Natural Science Foundation of China;
Keywords
Approximation; neural nets; quantization;
DOI
10.1109/TIT.2023.3240360
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
We deal with two complementary questions about the approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as unquantized ones. We also characterize the error of nearest-neighbour uniform quantization of ReLU networks. This is achieved using a new lower bound on the Lipschitz constant of the map that associates the parameters of ReLU networks to their realization, together with an upper bound generalizing classical results. Second, we investigate when ReLU networks can, or cannot, be expected to have better approximation properties than other classical approximation families. Indeed, several approximation families share the following common limitation: their polynomial asymptotic approximation speed of any set is bounded from above by the encoding speed of that set. We introduce a new abstract property of approximation families, called ∞-encodability, which implies this upper bound. Many classical approximation families, defined with dictionaries or with ReLU networks, are shown to be ∞-encodable. This unifies and generalizes several situations where this upper bound is known.
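The nearest-neighbour uniform quantization analysed in the abstract can be illustrated with a small numerical sketch: each parameter is rounded to the closest point of a uniform grid of step 2^(-k) (k "bits per coordinate"), and the deviation between the realizations of the original and quantized networks is then controlled by the Lipschitz constant of the parameters-to-realization map. The code below is an illustrative sketch only, not the authors' construction; the network shape, random parameters, and step sizes are hypothetical choices.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(params, x):
    # params: list of (W, b) pairs; ReLU after each hidden layer, linear output.
    for W, b in params[:-1]:
        x = relu(W @ x + b)
    W, b = params[-1]
    return W @ x + b

def quantize(params, step):
    # Nearest-neighbour uniform quantization: round every weight and bias
    # to the closest multiple of `step`.
    return [(np.round(W / step) * step, np.round(b / step) * step)
            for W, b in params]

rng = np.random.default_rng(0)
dims = [4, 16, 16, 1]  # toy architecture (an assumption for illustration)
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(dims[:-1], dims[1:])]

x = rng.standard_normal(4)
for k in (4, 8, 12):  # k bits per coordinate -> grid step 2^(-k)
    qparams = quantize(params, 2.0 ** -k)
    err = abs(forward(params, x) - forward(qparams, x)).item()
    print(f"k={k:2d} bits, output deviation {err:.3e}")
```

As the step shrinks, the per-coordinate quantization error (at most half the step) shrinks, and the output deviation decreases roughly in proportion, consistent with the Lipschitz upper bound discussed in the abstract.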
Pages: 3960-3977 (18 pages)
References
19 records
[1] Arora, R. Proceedings of the International Conference on Learning Representations, 2018, p. 98.
[2] Berner, Julius; Grohs, Philipp; Jentzen, Arnulf. Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black-Scholes Partial Differential Equations. SIAM Journal on Mathematics of Data Science, 2020, 2(3): 631-657.
[3] Boelcskei, Helmut; Grohs, Philipp; Kutyniok, Gitta; Petersen, Philipp. Optimal Approximation with Sparsely Connected Deep Neural Networks. SIAM Journal on Mathematics of Data Science, 2019, 1(1): 8-45.
[4] Courbariaux, M. Advances in Neural Information Processing Systems, 2015, vol. 28.
[5] DeVore, Ronald; Hanin, Boris; Petrova, Guergana. Neural network approximation. Acta Numerica, 2021, 30: 327-444.
[6] DeVore, Ronald A. Constructive Approximation. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 303, 1993.
[7] Elbrachter, Dennis; Perekrestenko, Dmytro; Grohs, Philipp; Boelcskei, Helmut. Deep Neural Network Approximation Theory. IEEE Transactions on Information Theory, 2021, 67(5): 2581-2623.
[8] Grohs, P. Applied and Numerical Harmonic Analysis, 2015, vol. 68, p. 199. DOI: 10.1007/978-3-319-18863-8_5.
[9] Guehring, Ingo; Raslan, Mones. Approximation rates for neural networks with encodable weights in smoothness spaces. Neural Networks, 2021, 134: 107-130.
[10] Kerkyacharian, G.; Picard, D. Entropy, universal coding, approximation, and bases properties. Constructive Approximation, 2003, 20(1): 1-37.