The nature of unsupervised learning in deep neural networks: A new understanding and novel approach

Cited by: 0
Authors
Golovko V. [1 ,2 ]
Kroshchanka A. [1 ]
Treadwell D. [3 ]
Affiliations
[1] Brest State Technical University, Belarus
[2] National Research Nuclear University (MEPHI), Moscow
[3] 5339 Iron Horse Pkwy, Dublin, CA 94568
Source
Optical Memory and Neural Networks (Information Optics) | 2016, Vol. 25, No. 3
Keywords
data visualization; deep learning; deep neural networks; machine learning; restricted Boltzmann machine;
DOI
10.3103/S1060992X16030073
Abstract
Over the last decade, deep neural networks have become a hot topic in machine learning. They are a breakthrough technology for processing images, video, speech, text, and audio. Its deep architecture permits a deep neural network to overcome some limitations of shallow neural networks. In this paper we investigate the nature of unsupervised learning in the restricted Boltzmann machine. We prove that maximizing the log-likelihood of the input data distribution of a restricted Boltzmann machine is equivalent to minimizing the cross-entropy, and to a special case of minimizing the mean squared error. The nature of unsupervised learning is thus invariant to these different training criteria. On this basis we propose a new technique, called "REBA", for the unsupervised training of deep neural networks. In contrast to Hinton's conventional approach to learning a restricted Boltzmann machine, which is based on a linear training rule, the proposed technique is founded on a nonlinear training rule. We show that the classical equations for RBM learning are a special case of the proposed technique, so the proposed approach is more universal than the traditional energy-based model. We demonstrate the performance of the REBA technique on a well-known benchmark problem. The main contribution of this paper is a novel view of, and a new understanding of, unsupervised learning in deep neural networks. © 2016, Allerton Press, Inc.
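For context on the abstract's claims: maximizing the mean data log-likelihood (1/N) Σ_n log p(v_n) differs from the cross-entropy between the empirical data distribution and the model distribution only by a constant entropy term, which is consistent with the stated equivalence of training criteria. The "linear training rule" contrasted with REBA is the classical contrastive-divergence update of ref. [2], whose weight increment is a difference of pairwise activation correlations. The NumPy sketch below illustrates that classical CD-1 rule only, assuming Bernoulli units; the function names, shapes, and hyperparameters are illustrative, and the paper's nonlinear REBA rule itself is not reproduced here.

```python
import numpy as np

# Minimal sketch of the classical CD-1 learning rule for a Bernoulli RBM,
# after Hinton (2002), ref. [2]. Names, shapes, and hyperparameters are
# illustrative assumptions, not taken from the paper.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 update. v0: (batch, n_vis) data; W: (n_vis, n_hid) weights;
    b: (n_vis,) visible biases; c: (n_hid,) hidden biases."""
    # Positive phase: hidden probabilities driven by the data.
    h0 = sigmoid(v0 @ W + c)
    # One Gibbs step: sample hidden states, then reconstruct the visibles.
    h_samp = (rng.random(h0.shape) < h0).astype(v0.dtype)
    v1 = sigmoid(h_samp @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    # Hinton's rule: the increment is a difference of correlations,
    # i.e. linear in the unit activations.
    W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return W, b, c

# Toy usage on random binary data.
rng = np.random.default_rng(1)
v = (rng.random((32, 784)) < 0.5).astype(np.float64)
W = 0.01 * rng.standard_normal((784, 64))
b, c = np.zeros(784), np.zeros(64)
W, b, c = cd1_step(v, W, b, c)
```

According to the abstract, the classical update above falls out as a special case of REBA's nonlinear rule, which is what makes the proposed approach more general than the traditional energy-based formulation.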
Pages: 127-141
Number of pages: 14
References (22 in total)
[1] Hinton G., Osindero S., Teh Y., A fast learning algorithm for deep belief nets, Neural Computation, 18, pp. 1527-1554, (2006)
[2] Hinton G., Training products of experts by minimizing contrastive divergence, Neural Computation, 14, pp. 1771-1800, (2002)
[3] Hinton G., Salakhutdinov R., Reducing the dimensionality of data with neural networks, Science, 313, 5786, pp. 504-507, (2006)
[4] Hinton G.E., A practical guide to training restricted Boltzmann machines, Tech. Rep. UTML TR 2010-003, University of Toronto, (2010)
[5] Krizhevsky A., Sutskever I., Hinton G., ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 25, pp. 1090-1098, (2012)
[6] LeCun Y., Bengio Y., Hinton G., Deep learning, Nature, 521, 7553, pp. 436-444, (2015)
[7] Mikolov T., Deoras A., Povey D., Burget L., Cernocky J., Strategies for training large scale neural network language models, Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 195-201, (2011)
[8] Hinton G., Et al., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, 29, pp. 82-97, (2012)
[9] Bengio Y., Learning deep architectures for AI, Foundations and Trends in Machine Learning, 2, 1, pp. 1-127, (2009)
[10] Bengio Y., Lamblin P., Popovici D., Larochelle H., Greedy layer-wise training of deep networks, Advances in Neural Information Processing Systems, 19, (2007)