Statistical mechanics of unsupervised feature learning in a restricted Boltzmann machine with binary synapses

Cited by: 25
Author
Huang, Haiping [1]
Affiliation
[1] RIKEN Brain Sci Inst, Wako, Saitama 3510198, Japan
Keywords
cavity and replica method; statistical inference; learning theory; neuronal networks; SPIN-GLASS; NEURAL-NETWORKS; ALGORITHMS; MODEL; INFERENCE;
DOI
10.1088/1742-5468/aa6ddc
Chinese Library Classification number
O3 [Mechanics];
Discipline classification code
08 ; 0801 ;
Abstract
Revealing hidden features in unlabeled data is called unsupervised feature learning, which plays an important role in pretraining a deep neural network. Here we provide a statistical mechanics analysis of unsupervised learning in a restricted Boltzmann machine with binary synapses. A message passing equation to infer the hidden feature is derived, and variants of this equation are analyzed. A statistical analysis by replica theory describes the thermodynamic properties of the model. Our analysis confirms an entropy crisis preceding the non-convergence of the message passing equation, suggesting a discontinuous phase transition as a key characteristic of the restricted Boltzmann machine. A continuous phase transition is also confirmed, depending on the embedded feature strength in the data. The mean-field result under the replica symmetric assumption agrees with that obtained by running message passing algorithms on single instances of finite sizes. Interestingly, in an approximate Hopfield model, the entropy crisis is absent, and a continuous phase transition is observed instead. We also develop an iterative equation to infer the hyper-parameter (temperature) hidden in the data, which in physics corresponds to iteratively imposing the Nishimori condition. Our study provides insights towards understanding the thermodynamic properties of restricted Boltzmann machine learning, and moreover an important theoretical basis for building simplified deep networks.
Pages: 25
Related papers
29 in total
[1]   Multitasking Associative Networks [J].
Agliari, Elena ;
Barra, Adriano ;
Galluzzi, Andrea ;
Guerra, Francesco ;
Moauro, Francesco .
PHYSICAL REVIEW LETTERS, 2012, 109 (26)
[2]   STORING INFINITE NUMBERS OF PATTERNS IN A SPIN-GLASS MODEL OF NEURAL NETWORKS [J].
AMIT, DJ ;
GUTFREUND, H ;
SOMPOLINSKY, H .
PHYSICAL REVIEW LETTERS, 1985, 55 (14) :1530-1533
[3]   [Anonymous], 2015, P 32 INT C MACH LEAR
[4]   Representation Learning: A Review and New Perspectives [J].
Bengio, Yoshua ;
Courville, Aaron ;
Vincent, Pascal .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (08) :1798-1828
[5]   An Iterative Construction of Solutions of the TAP Equations for the Sherrington-Kirkpatrick Model [J].
Bolthausen, Erwin .
COMMUNICATIONS IN MATHEMATICAL PHYSICS, 2014, 325 (01) :333-366
[6]   Spin-glass theory for pedestrians [J].
Castellani, T ;
Cavagna, A .
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2005, :215-266
[7]   Courbariaux Matthieu, 2015, CoRR
[8]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[9]   Message-passing algorithms for compressed sensing [J].
Donoho, David L. ;
Maleki, Arian ;
Montanari, Andrea .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (45) :18914-18919
[10]   SPIN-GLASSES WITH P-SPIN INTERACTIONS [J].
GARDNER, E .
NUCLEAR PHYSICS B, 1985, 257 (06) :747-765