Theory of deep convolutional neural networks: Downsampling

Cited by: 149
Authors
Zhou, Ding-Xuan [1,2]
Affiliations
[1] City Univ Hong Kong, Sch Data Sci, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Math, Kowloon, Hong Kong, Peoples R China
Keywords
Deep learning; Convolutional neural networks; Approximation theory; Downsampling; Filter masks; MULTILAYER FEEDFORWARD NETWORKS; OPTIMAL APPROXIMATION; REGRESSION; ALGORITHM; BOUNDS;
DOI
10.1016/j.neunet.2020.01.018
CLC number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Establishing a solid theoretical foundation for structured deep neural networks is greatly desired, given the successful applications of deep learning across practical domains. This paper develops an approximation theory for deep convolutional neural networks whose structures are induced by convolutions. To overcome the difficulty of analyzing networks whose widths grow linearly because of convolutions, we introduce a downsampling operator that reduces the widths. We prove that downsampled deep convolutional neural networks approximate ridge functions well, which hints at some advantages of these structured networks in terms of approximation and modeling. We also prove that the output of any multi-layer fully-connected neural network can be realized by that of a downsampled deep convolutional neural network with free parameters of the same order, which shows that, in general, the approximation ability of deep convolutional neural networks is at least as good as that of fully-connected networks. Finally, a theorem on approximating functions on Riemannian manifolds is presented, demonstrating that deep convolutional neural networks can be used to learn manifold features of data. (C) 2020 Elsevier Ltd. All rights reserved.
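The width-growth issue and the downsampling remedy described in the abstract can be seen in a toy 1-D convolutional network. This is an illustrative sketch only, not the paper's actual construction: the ReLU activation, filter size, and downsampling rate below are assumptions chosen for demonstration.

```python
import numpy as np

def conv_layer(x, w):
    # 1-D full convolution: output width is len(x) + len(w) - 1,
    # so widths grow linearly with depth (the difficulty the paper addresses)
    return np.maximum(np.convolve(x, w), 0.0)  # ReLU activation (assumed)

def downsample(v, d):
    # Downsampling operator: keep every d-th component to curb width growth
    return v[::d]

rng = np.random.default_rng(0)
x = rng.standard_normal(16)      # input of dimension 16
w = rng.standard_normal(4)       # filter mask of size 4 (assumed)
h1 = conv_layer(x, w)            # width 16 + 4 - 1 = 19
h2 = conv_layer(h1, w)           # width 19 + 4 - 1 = 22: linear growth
hd = downsample(h2, 2)           # downsampling halves the width to 11
print(len(h1), len(h2), len(hd))
```

Without downsampling, each additional convolutional layer adds `len(w) - 1` units of width, so an L-layer network has width of order L; the downsampling operator keeps the widths bounded while the free parameters remain the shared filter masks.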
Pages: 319-327 (9 pages)