Information Theoretical Analysis of Deep Learning Representations

Cited by: 1
Authors
Furusho, Yasutaka [1 ]
Kubo, Takatomi [1 ]
Ikeda, Kazushi [1 ]
Affiliations
[1] Nara Inst Sci & Technol, Ikoma, Nara 6300192, Japan
Source
NEURAL INFORMATION PROCESSING, PT I | 2015, Vol. 9489
Keywords
Entropy; Conditional entropy; Mutual information
DOI
10.1007/978-3-319-26532-2_66
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although deep learning achieves high performance in pattern recognition and machine learning, the reasons for this success remain largely unexplained. To tackle this problem, we calculated information-theoretic quantities of the representations in the hidden layers and analyzed their relationship to performance. We found that the entropy and the mutual information decrease in different ways as the layers get deeper. This suggests that these information-theoretic quantities may serve as a criterion for determining the number of layers in deep learning.
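The record gives no implementation details, so the following is only a minimal sketch of the kind of computation the abstract describes: assuming hidden activations are discretized into bins and the labels are discrete classes, the entropy H(T) of a layer's representation T and the mutual information I(T; Y) with the labels Y can be estimated from empirical frequencies. The function names, the binning scheme, and n_bins are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def discretize(activations, n_bins=30):
    """Map continuous activations to integer bin indices so that
    discrete entropy estimates apply. n_bins is an assumed setting."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    return np.digitize(activations, edges[1:-1])

def entropy(codes):
    """Shannon entropy H(T) in bits, treating each distinct row
    (one binned activation pattern) as a single symbol."""
    _, counts = np.unique(codes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(codes, labels):
    """I(T; Y) = H(T) - H(T|Y), with H(T|Y) averaged over the
    empirical label distribution."""
    h_t_given_y = sum(
        np.mean(labels == y) * entropy(codes[labels == y])
        for y in np.unique(labels)
    )
    return entropy(codes) - h_t_given_y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1000, 8))      # stand-in for one layer's activations
    y = rng.integers(0, 10, size=1000)  # stand-in for class labels
    t = discretize(x)
    print(entropy(t), mutual_information(t, y))
```

Applying these estimators to the activations of each hidden layer in turn would trace how H(T) and I(T; Y) change with depth, which is the comparison the abstract reports.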
Pages: 599-605
Number of pages: 7