Expressive power of ReLU and step networks under floating-point operations

Cited by: 0
Authors
Park, Yeachan [1]
Hwang, Geonho [1]
Lee, Wonyeol [2]
Park, Sejun [3]
Affiliations
[1] Korea Inst Adv Study, Seoul 02455, South Korea
[2] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
[3] Korea Univ, Dept Artificial Intelligence, Seoul 02841, South Korea
Funding
National Research Foundation, Singapore
Keywords
Neural networks; Universal approximation; Memorization; Floating-point arithmetic; Feedforward networks; Bounds
DOI
10.1016/j.neunet.2024.106297
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Studies of the expressive power of neural networks have investigated their fundamental limits. Most existing results assume real-valued inputs and parameters as well as exact operations during the evaluation of neural networks. However, neural networks are typically executed on computers that can only represent a tiny subset of the reals and apply inexact operations; that is, most existing results do not apply to neural networks used in practice. In this work, we analyze the expressive power of neural networks under a more realistic setup: when we use floating-point numbers and operations as in practice. Our first set of results assumes floating-point operations where the significand of a float is represented by finitely many bits but its exponent can take any integer value. Under this setup, we show that neural networks using a binary threshold unit or ReLU can memorize any finite set of input/output pairs and can approximate any continuous function within an arbitrary error. In particular, the number of parameters in our constructions for universal approximation and memorization coincides with that in classical results assuming exact mathematical operations. We also show similar results on memorization and universal approximation when floating-point operations use finitely many bits for both significand and exponent; these results are applicable to many popular floating-point formats such as those defined in the IEEE 754 standard (e.g., the 32-bit single-precision format) and bfloat16.
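To make the setup concrete, the following is a minimal Python sketch, not the paper's construction: it builds the classical one-hidden-layer step-network memorizer for scalar inputs and then evaluates it with ordinary IEEE 754 double arithmetic, so the small residuals left by rounded additions illustrate the kind of floating-point discrepancy the paper's constructions are designed to control. The function names (`step`, `memorize_with_steps`) and the sample data are illustrative assumptions, not taken from the paper.

```python
def step(z):
    """Binary threshold unit: 1 if z >= 0, else 0."""
    return 1.0 if z >= 0.0 else 0.0


def memorize_with_steps(xs, ys):
    """Classical construction: a one-hidden-layer step network with
    f(x) = y_(1) + sum_i (y_(i+1) - y_(i)) * step(x - t_i),
    where each threshold t_i lies between consecutive sorted inputs.
    In exact real arithmetic, f(x_i) = y_i for every given pair."""
    pairs = sorted(zip(xs, ys))
    xs_sorted = [x for x, _ in pairs]
    ys_sorted = [y for _, y in pairs]
    thresholds = [(a + b) / 2.0 for a, b in zip(xs_sorted, xs_sorted[1:])]
    increments = [b - a for a, b in zip(ys_sorted, ys_sorted[1:])]

    def f(x):
        out = ys_sorted[0]
        for t, d in zip(thresholds, increments):
            out += d * step(x - t)  # each * and += is a rounded float operation
        return out

    return f


if __name__ == "__main__":
    xs = [0.3, 1.7, 2.2, 5.9]
    ys = [0.1, 0.2, 0.3, 0.7]
    f = memorize_with_steps(xs, ys)
    for x, y in zip(xs, ys):
        # Exact arithmetic would give error 0; double precision may not.
        print(x, y, f(x), abs(f(x) - y))
```

In exact real arithmetic the printed errors would all be zero; under double precision, some of the accumulated increments round, which is why a floating-point analysis of memorization cannot simply reuse the classical real-valued constructions.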
Pages: 18