Improving generalization of deep neural networks by leveraging margin distribution

Cited by: 9

Authors:

Lyu, Shen-Huan [1]

Wang, Lu [1]

Zhou, Zhi-Hua [1]

Affiliations:

[1] Nanjing University, National Key Laboratory for Novel Software Technology, Nanjing 210023, People's Republic of China

Source:

NEURAL NETWORKS | 2022 / Vol. 151

Keywords:

Deep neural network; Margin theory; Generalization;

DOI:

10.1016/j.neunet.2022.03.019

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recent research has used margin theory to analyze the generalization performance of deep neural networks (DNNs). Existing results are almost all based on the spectrally-normalized minimum margin. However, optimizing the minimum margin ignores a great deal of information about the entire margin distribution, which is crucial to generalization performance. In this paper, we prove a generalization upper bound dominated by the statistics of the entire margin distribution. Compared with the minimum margin bounds, our bound highlights an important measure for controlling the complexity: the ratio of the margin standard deviation to the expected margin. We apply a convex margin distribution loss function to deep neural networks to validate our theoretical results by optimizing the margin ratio. Experiments and visualizations confirm the effectiveness of our approach and the correlation between the generalization gap and the margin ratio. (c) 2022 Elsevier Ltd. All rights reserved.
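
The margin ratio named in the abstract (margin standard deviation divided by expected margin) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact margin definition, so the code assumes the common unnormalized multiclass margin (true-class score minus the best competing score) and omits the spectral normalization and the convex loss mentioned above. The function names `margins` and `margin_ratio` are hypothetical.

```python
from statistics import fmean, pstdev


def margins(scores, labels):
    """Per-sample margin: true-class score minus the best other-class score.

    Assumption: unnormalized multiclass margin; the paper's own definition
    may differ (for example, by a spectral normalization factor).
    """
    result = []
    for row, y in zip(scores, labels):
        best_other = max(s for k, s in enumerate(row) if k != y)
        result.append(row[y] - best_other)
    return result


def margin_ratio(scores, labels):
    """Ratio of the margin standard deviation to the mean margin."""
    m = margins(scores, labels)
    mean = fmean(m)
    if mean <= 0:
        # Assumption: the ratio is only read as a complexity measure
        # when the average margin is positive.
        raise ValueError("expected margin must be positive")
    return pstdev(m) / mean


# Toy example: three samples, three classes, all classified correctly.
scores = [[2.0, 0.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 4.0]]
labels = [0, 1, 2]
print(margins(scores, labels))       # [2.0, 2.0, 3.0]
print(margin_ratio(scores, labels))  # roughly 0.202
```

Under these assumptions, a smaller ratio means the margins are both large on average and tightly concentrated, which is the quantity the abstract relates to the generalization gap.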


Pages: 48-60

Page count: 13

