Approximation Analysis of Convolutional Neural Networks

Cited by: 7
Authors
Bao, Chenglong [1 ,2 ]
Li, Qianxiao [3 ,4 ]
Shen, Zuowei [3 ]
Tai, Cheng [5 ]
Wu, Lei [6 ]
Xiang, Xueshuang [7 ]
Affiliations
[1] Tsinghua Univ, Yau Math Sci Ctr, Jing Zhai, Beijing 100084, Peoples R China
[2] YanQi Lake Beijing Inst Math Sci & Applicat, 544 HeFangkou Village, Beijing 101408, Peoples R China
[3] Natl Univ Singapore, Dept Math, BLK S17,10 Lower Kent Ridge Rd, Singapore 119076, Singapore
[4] Natl Univ Singapore, Inst Funct Intelligent Mat, BLK S9,4 Sci Dr 2, Singapore 117544, Singapore
[5] Peking Univ, Beijing Inst Big Data Res, 6 Jingyuan, Beijing 100871, Peoples R China
[6] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[7] Qian Xuesen Lab Space Technol, 104 Youyi Rd, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
Convolutional networks; approximation; scaling analysis; compositional functions; DEEP; BOUNDS;
DOI
10.4208/eajam.2022-270.070123
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In its simplest form, a convolutional neural network (CNN) consists of a fully connected two-layer network g composed with a sequence of convolution layers T. Although g is known to have the universal approximation property, it is not known whether CNNs, which have the form g ∘ T, inherit this property, especially when the kernel size in T is small. In this paper, we show that under suitable conditions, CNNs do inherit the universal approximation property, and their sample complexity can be characterized. In addition, we discuss concretely how the nonlinearity of T can improve the approximation power. Finally, we show that when the target function class has a certain compositional form, convolutional networks are far more advantageous than fully connected networks, in terms of the number of parameters needed to achieve the desired accuracy.
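The g ∘ T structure described in the abstract can be sketched numerically. The following is a minimal NumPy illustration, not the paper's construction: all sizes (input length 16, three convolution layers with kernel size 3, a width-8 fully connected network) are hypothetical, chosen only to show a small-kernel T feeding a two-layer g.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernel):
    """Valid 1-D convolution of signal x with a small kernel."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

def T(x, kernels):
    """Sequence of convolution layers, each followed by a ReLU nonlinearity."""
    for kernel in kernels:
        x = np.maximum(conv1d(x, kernel), 0.0)
    return x

def g(z, W1, b1, W2, b2):
    """Fully connected two-layer network applied to the features T(x)."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2

# Hypothetical sizes: input length 16, three conv layers with kernel size 3.
x = rng.standard_normal(16)
kernels = [rng.standard_normal(3) for _ in range(3)]
z = T(x, kernels)                 # valid convolutions: length 16 -> 10
W1 = rng.standard_normal((8, len(z))); b1 = rng.standard_normal(8)
W2 = rng.standard_normal((1, 8));      b2 = rng.standard_normal(1)
y = g(z, W1, b1, W2, b2)          # output of the CNN g ∘ T
```

The point of the sketch is the factorization: all approximation power attributed to the CNN must pass through the composition of g with the small-kernel map T, which is the setting the paper analyzes.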
Pages: 524-549
Page count: 26