Approximation Analysis of Convolutional Neural Networks

Cited by: 7
Authors
Bao, Chenglong [1 ,2 ]
Li, Qianxiao [3 ,4 ]
Shen, Zuowei [3 ]
Tai, Cheng [5 ]
Wu, Lei [6 ]
Xiang, Xueshuang [7 ]
Affiliations
[1] Tsinghua Univ, Yau Math Sci Ctr, Jing Zhai, Beijing 100084, Peoples R China
[2] YanQi Lake Beijing Inst Math Sci & Applicat, 544 HeFangkou Village, Beijing 101408, Peoples R China
[3] Natl Univ Singapore, Dept Math, BLK S17,10 Lower Kent Ridge Rd, Singapore 119076, Singapore
[4] Natl Univ Singapore, Inst Funct Intelligent Mat, BLK S9,4 Sci Dr 2, Singapore 117544, Singapore
[5] Peking Univ, Beijing Inst Big Data Res, 6 Jingyuan, Beijing 100871, Peoples R China
[6] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[7] Qian Xuesen Lab Space Technol, 104 Youyi Rd, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
Convolutional networks; approximation; scaling analysis; compositional functions; DEEP; BOUNDS;
DOI
10.4208/eajam.2022-270.070123
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In its simplest form, a convolutional neural network (CNN) consists of a fully connected two-layer network g composed with a sequence of convolution layers T. Although g is known to have the universal approximation property, it is not known whether CNNs, which have the form g ∘ T, inherit this property, especially when the kernel size in T is small. In this paper, we show that under suitable conditions, CNNs do inherit the universal approximation property, and their sample complexity can be characterized. In addition, we discuss concretely how the nonlinearity of T can improve the approximation power. Finally, we show that when the target function class has a certain compositional form, convolutional networks are far more advantageous than fully connected networks, in terms of the number of parameters needed to achieve the desired accuracy.
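The g ∘ T structure described in the abstract can be sketched numerically. The following is a minimal NumPy illustration, not the paper's construction: all sizes (input length 16, three convolution layers with kernel size 3, a width-8 fully connected network) are hypothetical, chosen only to show a small-kernel T feeding a two-layer g.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernel):
    """Valid 1-D convolution of signal x with a small kernel."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

def T(x, kernels):
    """Sequence of convolution layers, each followed by a ReLU nonlinearity."""
    for kernel in kernels:
        x = np.maximum(conv1d(x, kernel), 0.0)
    return x

def g(z, W1, b1, W2, b2):
    """Fully connected two-layer network applied to the features T(x)."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2

# Hypothetical sizes: input length 16, three conv layers with kernel size 3.
x = rng.standard_normal(16)
kernels = [rng.standard_normal(3) for _ in range(3)]
z = T(x, kernels)                 # valid convolutions: length 16 -> 10
W1 = rng.standard_normal((8, len(z))); b1 = rng.standard_normal(8)
W2 = rng.standard_normal((1, 8));      b2 = rng.standard_normal(1)
y = g(z, W1, b1, W2, b2)          # output of the CNN g ∘ T
```

The point of the sketch is the factorization: all approximation power attributed to the CNN must pass through the composition of g with the small-kernel map T, which is the setting the paper analyzes.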
Pages: 524-549
Page count: 26