Information Theoretical Analysis of Deep Learning Representations

Cited by: 1
Authors
Furusho, Yasutaka [1 ]
Kubo, Takatomi [1 ]
Ikeda, Kazushi [1 ]
Affiliations
[1] Nara Inst Sci & Technol, Ikoma, Nara 6300192, Japan
Source
NEURAL INFORMATION PROCESSING, PT I | 2015, Vol. 9489
Keywords
Entropy; Conditional entropy; Mutual information
DOI
10.1007/978-3-319-26532-2_66
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although deep learning achieves high performance in pattern recognition and machine learning, the reasons for this success remain largely unexplained. To tackle this problem, we calculated information-theoretic quantities of the representations in the hidden layers and analyzed their relationship to performance. We found that the entropy and the mutual information decrease in different ways as the layers get deeper. This suggests that these information-theoretic quantities may serve as a criterion for determining the number of layers in deep learning.
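The record gives no implementation details, so the following is only a minimal sketch of the kind of computation the abstract describes: assuming hidden activations are discretized into bins and the labels are discrete classes, the entropy H(T) of a layer's representation T and the mutual information I(T; Y) with the labels Y can be estimated from empirical frequencies. The function names, the binning scheme, and n_bins are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def discretize(activations, n_bins=30):
    """Map continuous activations to integer bin indices so that
    discrete entropy estimates apply. n_bins is an assumed setting."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    return np.digitize(activations, edges[1:-1])

def entropy(codes):
    """Shannon entropy H(T) in bits, treating each distinct row
    (one binned activation pattern) as a single symbol."""
    _, counts = np.unique(codes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(codes, labels):
    """I(T; Y) = H(T) - H(T|Y), with H(T|Y) averaged over the
    empirical label distribution."""
    h_t_given_y = sum(
        np.mean(labels == y) * entropy(codes[labels == y])
        for y in np.unique(labels)
    )
    return entropy(codes) - h_t_given_y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1000, 8))      # stand-in for one layer's activations
    y = rng.integers(0, 10, size=1000)  # stand-in for class labels
    t = discretize(x)
    print(entropy(t), mutual_information(t, y))
```

Applying these estimators to the activations of each hidden layer in turn would trace how H(T) and I(T; Y) change with depth, which is the comparison the abstract reports.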
Pages: 599-605
Number of pages: 7