Universality of deep convolutional neural networks

Cited by: 367
Authors
Zhou, Ding-Xuan [1 ,2 ]
Affiliations
[1] City University of Hong Kong, School of Data Science, Kowloon, Hong Kong, People's Republic of China
[2] City University of Hong Kong, Department of Mathematics, Kowloon, Hong Kong, People's Republic of China
Keywords
Deep learning; Convolutional neural network; Universality; Approximation theory; Multilayer feedforward networks; Optimal approximation; Bounds
DOI
10.1016/j.acha.2019.06.004
Chinese Library Classification (CLC)
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, and many other domains. Deep neural network architectures and the associated computational issues have been well studied in machine learning, but a theoretical foundation for understanding the approximation and generalization abilities of deep learning methods generated by network architectures such as deep convolutional neural networks is still lacking. Here we show that a deep convolutional neural network (CNN) is universal, meaning that it can approximate any continuous function to arbitrary accuracy when the depth of the network is large enough. This answers an open question in learning theory. Our quantitative estimate, stated tightly in terms of the number of free parameters to be computed, verifies the efficiency of deep CNNs in dealing with high-dimensional data. Our study also demonstrates the role of convolutions in deep CNNs. © 2019 Elsevier Inc. All rights reserved.
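The universality claim in the abstract can be written out explicitly. The following is an illustrative paraphrase, not the paper's verbatim theorem: the symbols Ω, f_J, and the uniform norm are our notation, and the ReLU activation is an assumption consistent with the paper's setting.

\[
\forall\, f \in C(\Omega),\ \forall\, \varepsilon > 0:\quad
\exists\, J \in \mathbb{N} \ \text{and a depth-}J\ \text{CNN output}\ f_J \ \text{with}\quad
\| f - f_J \|_{C(\Omega)} \;=\; \sup_{x \in \Omega} \big| f(x) - f_J(x) \big| \;\le\; \varepsilon,
\]

where Ω ⊂ ℝ^d is compact and each hidden layer applies the ReLU activation σ(u) = max(u, 0) after a convolution.

To make concrete the "role of convolutions" the abstract highlights, here is a minimal NumPy sketch of a single 1-D convolutional layer acting as a Toeplitz-type linear map followed by ReLU. The filter values, layer widths, and padding convention are assumptions for illustration, not the paper's exact construction, and bias terms are omitted for brevity.

import numpy as np

def conv_layer(x, w):
    # Full 1-D convolution: apply the (d+s-1) x d Toeplitz matrix T of the
    # filter w (T[i, j] = w[i-j]) to x, then the ReLU activation.
    # Note the output width grows by s-1, so stacked layers widen gradually.
    d, s = len(x), len(w)
    T = np.zeros((d + s - 1, d))
    for i in range(d + s - 1):
        for j in range(d):
            if 0 <= i - j < s:
                T[i, j] = w[i - j]
    return np.maximum(T @ x, 0.0)

# Stacking layers increases the depth J; universality concerns letting J grow.
x = np.random.randn(8)                           # input, d = 8
h1 = conv_layer(x, np.array([0.5, -0.25, 0.1]))  # depth 1, width 10
h2 = conv_layer(h1, np.array([0.3, 0.7]))        # depth 2, width 11

Writing the layer as an explicit Toeplitz matrix (rather than calling np.convolve, which computes the same product in 'full' mode) makes visible that each convolutional layer is a highly structured, parameter-sparse linear map: a filter of length s contributes only s free parameters, which is the kind of parameter count the abstract's quantitative estimate is stated in.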
Pages: 787-794
Number of pages: 8