Information theoretic perspective on sample complexity

被引:3
作者
Pereg, Deborah [1 ,2 ,3 ,4 ,5 ]
机构
[1] MGH, Wellman Ctr Photomed, Boston, MA USA
[2] Harvard Med Sch, Sch Med, Boston, MA USA
[3] MIT CSAIL, Cambridge, MA USA
[4] MIT MechE, Cambridge, MA 02139 USA
[5] Harvard Sch Engn & Appl Sci, Cambridge, MA 02138 USA
关键词
Information theory; Supervised learning; Sample complexity; Generalization;
D O I
10.1016/j.neunet.2023.08.032
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The statistical supervised learning framework assumes an input-output set with a joint probability distribution that is reliably represented by the training dataset. The learning system is then required to output a prediction rule learned from the training dataset's input-output pairs. In this work, we investigate the relationship between the sample complexity, the empirical risk and the generalization error based on the asymptotic equipartition property (AEP) (Shannon, 1948). We provide theoretical guarantees for reliable learning under the information-theoretic AEP, with respect to the generalization error and the sample size in different settings.(c) 2023 Elsevier Ltd. All rights reserved.
引用
收藏
页码:445 / 449
页数:5
相关论文
共 20 条
  • [1] Austin T, 2017, Entropy and ergodic theory
  • [2] Learning Deep Architectures for AI
    Bengio, Yoshua
    [J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2009, 2 (01): : 1 - 127
  • [3] Bottou L., 2007, SCALING LEARNING ALG, P321
  • [4] THE INDIVIDUAL ERGODIC THEOREM OF INFORMATION-THEORY
    BREIMAN, L
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1957, 28 (03): : 809 - 811
  • [5] Cover T. M., 1999, Elements of Information Theory
  • [6] Gray RM, 2011, ENTROPY AND INFORMATION THEORY , SECOND EDITION, P395, DOI 10.1007/978-1-4419-7970-4
  • [7] Topics in Multi-User Information Theory
    Kramer, Gerhard
    [J]. FOUNDATIONS AND TRENDS IN COMMUNICATIONS AND INFORMATION THEORY, 2007, 4 (4-5): : 265 - 444
  • [8] THE BASIC THEOREMS OF INFORMATION THEORY
    MCMILLAN, B
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1953, 24 (02): : 196 - 219
  • [9] Fields of Experts
    Roth, Stefan
    Black, Michael J.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2009, 82 (02) : 205 - 229
  • [10] Shalev-Shwartz S, 2014, Understanding machine learning: from theory to algorithms, DOI DOI 10.1017/CBO9781107298019