On Lower Bounds for Statistical Learning Theory

Cited by: 8
Authors
Loh, Po-Ling [1]
Affiliation
[1] Univ Wisconsin, Dept Elect & Comp Engn, 1415 Engn Dr, Madison, WI 53706 USA
Keywords
machine learning; minimax estimation; community recovery; online learning; multi-armed bandits; channel decoding; threshold phenomena; MINIMAX RATES; ENTROPY; CONVERGENCE; INFORMATION; SELECTION;
DOI
10.3390/e19110617
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
In recent years, tools from information theory have played an increasingly prevalent role in statistical machine learning. In addition to developing efficient, computationally feasible algorithms for analyzing complex datasets, it is of theoretical importance to determine whether such algorithms are "optimal" in the sense that no other algorithm can lead to smaller statistical error. This paper provides a survey of various techniques used to derive information-theoretic lower bounds for estimation and learning. We focus on the settings of parameter and function estimation, community recovery, and online learning for multi-armed bandits. A common theme is that lower bounds are established by relating the statistical learning problem to a channel decoding problem, for which lower bounds may be derived involving information-theoretic quantities such as the mutual information, total variation distance, and Kullback-Leibler divergence. We close by discussing the use of information-theoretic quantities to measure independence in machine learning applications ranging from causality to medical imaging, and mention techniques for estimating these quantities efficiently in a data-driven manner.
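The reduction from estimation to channel decoding mentioned in the abstract can be illustrated with two textbook bounds. The sketch below is not taken from the paper itself: it assumes the standard forms of Fano's inequality (for a hypothesis drawn uniformly from M candidates) and Le Cam's two-point bound combined with Pinsker's inequality. All function names are hypothetical and chosen for illustration only.

```python
import math


def kl_gaussian(mu0, mu1, sigma):
    """KL divergence (in nats) between N(mu0, sigma^2) and N(mu1, sigma^2)."""
    return (mu0 - mu1) ** 2 / (2 * sigma ** 2)


def fano_error_lower_bound(mutual_info_nats, num_hypotheses):
    """Fano's inequality for a uniform prior over M >= 2 hypotheses:
    P(decoding error) >= 1 - (I(V; X) + log 2) / log M.
    The result is clipped at 0 because the bound can be vacuous.
    """
    if num_hypotheses < 2:
        raise ValueError("need at least two hypotheses")
    bound = 1 - (mutual_info_nats + math.log(2)) / math.log(num_hypotheses)
    return max(0.0, bound)


def le_cam_error_lower_bound(kl_nats):
    """Two-point bound: with equal priors on P0 and P1, the best test has
    average error (1 - TV(P0, P1)) / 2. Pinsker's inequality gives
    TV <= sqrt(KL / 2), which yields a lower bound on that error.
    """
    tv_upper = min(1.0, math.sqrt(kl_nats / 2))
    return 0.5 * (1 - tv_upper)


if __name__ == "__main__":
    # Assumed toy setting: n i.i.d. samples from N(0, 1) versus N(delta, 1).
    # KL divergence is additive over independent samples.
    n = 100
    delta = 1 / math.sqrt(n)
    kl_total = n * kl_gaussian(0.0, delta, 1.0)
    print("Le Cam lower bound on testing error:", le_cam_error_lower_bound(kl_total))

    # Assumed toy setting: 16 hypotheses, mutual information of log 2 nats.
    print("Fano lower bound on decoding error:", fano_error_lower_bound(math.log(2), 16))
```

In both cases a lower bound on the error of any test or decoder translates into a lower bound on estimation error once the hypotheses are chosen to be well separated in the loss of interest; that separation step is problem-specific and is omitted here.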
Pages: 17
Related papers
50 in total
[31]   Lower Bounds on the Capacities of Quantum Relay Channels [J].
Shi Jin-Jing ;
Shi Rong-Hua ;
Peng Xiao-Qi ;
Guo Ying ;
Yi Liu-Yang ;
Lee Moon-Ho .
COMMUNICATIONS IN THEORETICAL PHYSICS, 2012, 58 (04) :487-492
[32]   Lower bounds for Dirichlet Laplacians and uncertainty principles [J].
Stollmann, Peter ;
Stolz, Guenter .
JOURNAL OF THE EUROPEAN MATHEMATICAL SOCIETY, 2021, 23 (07) :2337-2360
[33]   Lower bounds of cowidths and widths of multiplier operators [J].
Kushpel, Alexander .
JOURNAL OF COMPLEXITY, 2022, 69
[34]   Exact Upper and Lower Bounds on the Misclassification Probability [J].
Pinelis, Iosif .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (07) :4327-4334
[35]   Entropy bounds on Bayesian learning [J].
Gossner, Olivier ;
Tomala, Tristan .
JOURNAL OF MATHEMATICAL ECONOMICS, 2008, 44 (01) :24-32
[36]   Learning with semi-definite programming: statistical bounds based on fixed point analysis and excess risk curvature [J].
Chretien, Stephane ;
Cucuringu, Mihai ;
Lecue, Guillaume ;
Neirac, Lucie .
JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
[37]   A Theory of Universal Learning [J].
Bousquet, Olivier ;
Hanneke, Steve ;
Moran, Shay ;
van Handel, Ramon ;
Yehudayoff, Amir .
STOC '21: PROCEEDINGS OF THE 53RD ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2021, :532-541
[38]   A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models [J].
Suh, Namjoon ;
Cheng, Guang .
ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, 2025, 12 :177-207
[40]   Simple Models in Complex Worlds: Occam's Razor and Statistical Learning Theory [J].
Bargagli Stoffi, Falco J. ;
Cevolani, Gustavo ;
Gnecco, Giorgio .
MINDS AND MACHINES, 2022, 32 (01) :13-42