Generalization Bounds of Deep Neural Networks With τ-Mixing Samples

Times Cited: 0
Authors
Liu, Liyuan [1,2]
Chen, Yaohui [3]
Li, Weifu [1,2]
Wang, Yingjie [4]
Gu, Bin [5]
Zheng, Feng [6]
Chen, Hong [1,2]
Affiliations
[1] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Peoples R China
[2] Minist Educ, Engn Res Ctr Intelligent Technol Agr, Wuhan 430070, Peoples R China
[3] Huazhong Agr Univ, Coll Engn, Wuhan 430070, Peoples R China
[4] China Univ Petr East China, Coll Control Sci & Engn, Qingdao 266580, Peoples R China
[5] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China
[6] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Estimation; Convergence; Analytical models; Artificial neural networks; Time series analysis; Vectors; Robustness; Lipschitz; Learning systems; Hidden Markov models; τ-mixing; covering number; deep neural networks (DNNs); generalization bounds; TIME-SERIES; INEQUALITIES; SEQUENCES
DOI
10.1109/TNNLS.2025.3526235
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Code
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) have shown an astonishing ability to uncover the complicated relationships between inputs and their responses. Alongside these empirical successes, approximation analyses of DNNs have been developed to understand their generalization performance. However, the existing analysis depends heavily on the assumption that observations are independent and identically distributed (i.i.d.), which may be too idealized and is often violated in real-world applications. To relax the i.i.d. assumption, this article develops a covering-number-based concentration estimation to establish generalization bounds for DNNs with τ-mixing samples, where the dependence between samples is considerably more general and includes the α-mixing process as a special case. By assigning a specific parameter value to the τ-mixing process, our results are consistent with the existing convergence analysis in the i.i.d. case. Experiments on simulated data validate the theoretical findings.
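For context on the dependence measure in the title: the coefficient below is the standard τ-mixing definition of Dedecker and Prieur, which this line of work typically follows; the paper's exact formulation and notation may differ. For an integrable random variable X and a sub-σ-algebra M of the underlying probability space,

\[
\tau(\mathcal{M}, X) \;=\; \mathbb{E}\!\left[\,\sup_{f \in \Lambda_1}\left|\int f(x)\,\mathbb{P}_{X \mid \mathcal{M}}(dx) \;-\; \int f(x)\,\mathbb{P}_{X}(dx)\right|\,\right],
\]

where \Lambda_1 denotes the class of 1-Lipschitz real-valued functions. A sequence (X_t) is τ-mixing when the coefficients \tau(k) = \sup_t \tau(\sigma(X_s : s \le t),\, X_{t+k}) decay to zero as the gap k grows; for i.i.d. samples, \tau(k) = 0 for every k \ge 1, which is the sense in which τ-mixing relaxes the i.i.d. assumption.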
Pages: 15
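The abstract reports that experiments on simulated data support the bounds. As a loose, self-contained illustration of the setting (not the paper's actual experiment), the sketch below fits a small fully connected network to one-step-ahead prediction on an AR(1) sequence, a textbook example of a τ-mixing process, and reports the empirical train/test gap. The architecture, sample size, and AR coefficient are arbitrary illustrative choices.

# A minimal sketch, not the paper's experiment: train a small MLP on
# a dependent AR(1) series and measure the empirical generalization gap.
# All hyperparameters below are assumptions for illustration only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Simulate an AR(1) process x_t = rho * x_{t-1} + eps_t, which is
# tau-mixing with geometrically decaying coefficients when |rho| < 1.
n, rho = 5000, 0.7
eps = rng.normal(scale=0.5, size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = rho * x[t - 1] + eps[t]

# One-step-ahead prediction: each feature row is a window of `lag`
# consecutive past values; the target is the next value of the series.
lag = 5
X = np.column_stack([x[i : n - lag + i] for i in range(lag)])
y = x[lag:]

# Chronological split (no shuffling) so the dependence structure is
# preserved and the test block is genuinely "future" data.
split = int(0.8 * len(y))
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                   random_state=0)
net.fit(X_tr, y_tr)

gap = (mean_squared_error(y_te, net.predict(X_te))
       - mean_squared_error(y_tr, net.predict(X_tr)))
print(f"empirical generalization gap (test MSE - train MSE): {gap:.4f}")

The chronological split stands in for the blocking arguments used in concentration analyses of mixing sequences: a random shuffle would leak nearby (strongly dependent) observations into both sets and understate the gap.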