Uniform Convergence of Deep Neural Networks With Lipschitz Continuous Activation Functions and Variable Widths

Cited by: 0
Authors
Xu, Yuesheng [1]
Zhang, Haizhang [2]
Affiliations
[1] Old Dominion Univ, Dept Math & Stat, Norfolk, VA 23529 USA
[2] Sun Yat Sen Univ, Sch Math Zhuhai, Zhuhai 519082, Peoples R China
Funding
US National Science Foundation; US National Institutes of Health; National Natural Science Foundation of China
Keywords
Convergence; Vectors; Artificial neural networks; Kernel; Training; Deep learning; Uniform convergence; deep neural networks; convolutional neural networks; Lipschitz continuous activation functions; variable widths; RELU NETWORKS; ERROR-BOUNDS;
DOI
10.1109/TIT.2024.3439136
Chinese Library Classification
TP [Automation Technology & Computer Technology]
Discipline code
0812
Abstract
We consider deep neural networks (DNNs) with a Lipschitz continuous activation function and with weight matrices of variable widths. We establish a uniform convergence analysis framework in which sufficient conditions on the weight matrices and bias vectors, together with the Lipschitz constant, ensure uniform convergence of DNNs to a meaningful function as the number of their layers tends to infinity. Within this framework, we present specialized results on uniform convergence of DNNs with a fixed width, bounded widths, and unbounded widths. In particular, since convolutional neural networks are special DNNs with weight matrices of increasing widths, we put forward conditions on the mask sequence that lead to uniform convergence of the resulting convolutional neural networks. The Lipschitz continuity assumption on the activation functions allows our theory to cover most of the activation functions commonly used in applications.
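The convergence mechanism described in the abstract can be illustrated numerically. The sketch below is not the paper's actual hypotheses: the geometric decay of the weight and bias norms, the specific variable widths, and the mean-of-layer scalar readout are all illustrative assumptions chosen so that consecutive-depth networks become uniformly close on a grid of inputs, with ReLU as a 1-Lipschitz activation.

```python
import numpy as np

# Illustrative sketch (assumed decay conditions, not the paper's exact
# hypotheses): a DNN h_{n+1} = relu(W_n h_n + b_n) with variable layer
# widths, where relu is 1-Lipschitz and the weight/bias norms are scaled
# to be summable. The sup-norm gap between networks of consecutive depths
# then shrinks -- the Cauchy behavior behind uniform convergence in depth.

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

widths = [1, 8, 4, 6, 3, 5, 4, 4, 4, 4, 4]       # variable widths (assumed)
layers = []
for n in range(len(widths) - 1):
    W = rng.standard_normal((widths[n + 1], widths[n]))
    W *= 0.5 ** n / np.linalg.norm(W, 2)         # force ||W_n||_2 = 2^{-n}
    b = 0.5 ** n * rng.standard_normal(widths[n + 1])
    layers.append((W, b))

xs = np.linspace(-1.0, 1.0, 201).reshape(1, -1)  # grid of inputs in [-1, 1]

def readout(depth):
    """Mean of the depth-th hidden layer: an illustrative scalar readout so
    that networks whose final widths differ can still be compared."""
    h = xs
    for W, b in layers[:depth]:
        h = relu(W @ h + b[:, None])
    return h.mean(axis=0)

# sup-norm gap on the grid between networks of consecutive depths
gaps = [np.max(np.abs(readout(d + 1) - readout(d)))
        for d in range(1, len(layers))]
print(gaps)  # the gaps shrink as depth grows
```

Under these assumed decay rates the gaps are summable, so the depth-truncated networks form a Cauchy sequence in the uniform norm on the input grid; the paper's framework gives general sufficient conditions of this flavor for genuine uniform convergence.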
Pages: 7125-7142 (18 pages)