Generalization Comparison of Deep Neural Networks via Output Sensitivity

Cited by: 7
Authors
Forouzesh, Mahsa [1 ]
Salehi, Farnood [2 ]
Thiran, Patrick [1 ]
Affiliations
[1] École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
[2] Disney Research Studios, Zurich, Switzerland
Source
2020 25th International Conference on Pattern Recognition (ICPR) | 2021
Keywords
deep neural networks; generalization; sensitivity; bias-variance decomposition
DOI
10.1109/ICPR48806.2021.9412496
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Although recent works have brought some insight into the performance improvements of techniques used in state-of-the-art deep-learning models, more work is needed to understand their generalization properties. We shed light on this matter by linking the loss function to the sensitivity of the network's output to its input. We find a rather strong empirical relation between the output sensitivity and the variance term in the bias-variance decomposition of the loss function, which suggests using sensitivity as a metric for comparing the generalization performance of networks without requiring labeled data. We find that sensitivity is decreased by popular methods that improve the generalization performance of the model, such as (1) using a deep network rather than a wide one, (2) adding convolutional layers to baseline classifiers instead of fully-connected layers, (3) using batch normalization, dropout, and max-pooling, and (4) applying parameter-initialization techniques.
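The abstract does not spell out how output sensitivity is computed. As a minimal, non-authoritative sketch, one way to estimate it on unlabeled data is to perturb the inputs with small Gaussian noise and average the squared change in the network's output; the function name output_sensitivity, the noise scale sigma, and the sample count n_samples below are illustrative assumptions, not the paper's exact measure.

    import torch
    import torch.nn as nn

    def output_sensitivity(model, x, sigma=0.01, n_samples=20):
        # Estimate sensitivity as the mean squared change of the model
        # output under small Gaussian input perturbations. No labels are
        # needed; per the paper's finding, lower values tend to accompany
        # better generalization. (Illustrative estimator, an assumption.)
        model.eval()
        with torch.no_grad():
            base = model(x)
            total = 0.0
            for _ in range(n_samples):
                noisy = model(x + sigma * torch.randn_like(x))
                total += ((noisy - base) ** 2).mean().item()
        return total / n_samples

    # Usage on an untrained toy classifier (for illustration only):
    net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    print(output_sensitivity(net, torch.randn(64, 10)))

Because the estimate needs only forward passes on unlabeled inputs, the same batch can be fed to two candidate architectures and their sensitivity scores compared directly, which is the label-free comparison the abstract describes.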
Pages: 7411-7418
Page count: 8