Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets

被引:0
作者
Dupuis, Benjamin [1 ]
Viallard, Paul [2 ]
Deligiannidis, George [3 ]
Simsekli, Umut [4 ]
机构
[1] PSL Res Univ, Ecole Normale Super, INRIA, Dept Informat, Paris, France
[2] Univ Rennes, INRIA, CNRS IRISA UMR 6074, Rennes, France
[3] Univ Oxford, Dept Stat, Oxford, England
[4] PSL Res Univ, Ecole Normale Super, Dept Informat, INRIA,CNRS, Paris, France
基金
欧洲研究理事会;
关键词
Uniform Generalization Bounds; PAC-Bayesian Theory; Fractal Geometry; Langevin Dynamics; SGLD;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We propose data-dependent uniform generalization bounds by approaching the problem from a PAC-Bayesian perspective. We first apply the PAC-Bayesian framework on "random sets" in a rigorous way, where the training algorithm is assumed to output a data-dependent hypothesis set after observing the training data. This approach allows us to prove data- dependent bounds, which can be applicable in numerous contexts. To highlight the power of our approach, we consider two main applications. First, we propose a PAC-Bayesian formulation of the recently developed fractal-dimension-based generalization bounds. The derived results are shown to be tighter and they unify the existing results around one simple proof technique. Second, we prove uniform bounds over the trajectories of continuous Langevin dynamics and stochastic gradient Langevin dynamics. These results provide novel information about the generalization properties of noisy algorithms.
引用
收藏
页数:55
相关论文
共 82 条
[1]   User-friendly Introduction to PAC-Bayes Bounds [J].
Alquier, Pierre .
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2024, 17 (02) :174-303
[2]  
Ambroladze Amiran, 2006, Advances in Neural Information Processing Systems
[3]  
Amir I, 2022, Arxiv, DOI arXiv:2202.13328
[4]  
Amit R, 2022, ADV NEUR IN
[5]  
Andreeva R, 2024, Arxiv, DOI arXiv:2407.08723
[6]  
Andreeva Rayna, 2023, ICML 2023 WORKSH TOP
[7]  
Aristoff D, 2012, Arxiv, DOI arXiv:1205.2400
[8]  
Awasthi P, 2020, Arxiv, DOI arXiv:2007.11045
[9]  
Bartlett P. L., 2003, Journal of Machine Learning Research, V3, P463, DOI 10.1162/153244303321897690
[10]  
Bartlett Peter, 2002, Machine Learning