The emergence of a concept in shallow neural networks

Cited by: 33
Authors
Agliari, Elena [1 ]
Alemanno, Francesco [2 ]
Barra, Adriano [2 ,3 ]
De Marzo, Giordano [4 ,5 ]
Affiliations
[1] Sapienza Univ Roma, Dipartimento Matemat, P.le A. Moro 5, I-00185 Rome, Italy
[2] Univ Salento, Dipartimento Matemat & Fis, Campus Ecotekne, via Monteroni, I-73100 Lecce, Italy
[3] Ist Nazl Fis Nucl, Sez Lecce, Campus Ecotekne, via Monteroni, I-73100 Lecce, Italy
[4] Sapienza Univ Roma, Dipartimento Fis, P.le A. Moro 5, I-00185 Rome, Italy
[5] Ctr Ric Enrico Fermi, Via Panisperna 89a, I-00184 Rome, Italy
Keywords
Neural networks; Machine learning; Glassy statistical mechanics; Patterns
DOI
10.1016/j.neunet.2022.01.017
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We consider restricted Boltzmann machines (RBMs) trained over an unstructured dataset made of blurred copies of definite but unavailable "archetypes", and we show that there exists a critical sample size beyond which the RBM can learn the archetypes, i.e., the machine can successfully act as a generative model or as a classifier, according to the operational routine. In general, assessing a critical sample size (possibly in relation to the quality of the dataset) is still an open problem in machine learning. Here, restricting ourselves to the random theory, where shallow networks suffice and the "grandmother-cell" scenario is correct, we leverage the formal equivalence between RBMs and Hopfield networks to obtain a phase diagram for both neural architectures that highlights the regions, in the space of the control parameters (i.e., number of archetypes, number of neurons, size and quality of the training set), where learning can be accomplished. Our investigation relies on analytical methods based on the statistical mechanics of disordered systems, and the results are further corroborated by extensive Monte Carlo simulations. (C) 2022 Elsevier Ltd. All rights reserved.
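The setup described in the abstract can be illustrated with a minimal sketch: generate blurred binary copies of hidden archetypes, build a Hopfield-style coupling matrix from the examples alone (exploiting the RBM–Hopfield equivalence the paper leverages), and use a zero-temperature Monte Carlo dynamics to test whether the unseen archetype is retrieved. All parameter values below (N, K, M, the quality r) are illustrative assumptions, not the paper's, and this toy dynamics is far simpler than the full statistical-mechanics analysis of the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative control parameters (assumptions, not the paper's values):
N = 200   # number of neurons/spins
K = 3     # number of archetypes
M = 50    # training examples per archetype
r = 0.8   # dataset quality: P(example bit = archetype bit) = (1 + r) / 2

# Definite but "unavailable" archetypes: random binary patterns
archetypes = rng.choice([-1, 1], size=(K, N))

# Blurred copies: each bit is flipped independently with probability (1 - r) / 2
noise = rng.choice([-1, 1], p=[(1 - r) / 2, (1 + r) / 2], size=(K, M, N))
dataset = archetypes[:, None, :] * noise          # shape (K, M, N)

# Hebbian coupling matrix built from the examples only: the machine
# never sees the archetypes themselves
examples = dataset.reshape(K * M, N)
J = examples.T @ examples / N
np.fill_diagonal(J, 0.0)

# Zero-temperature Monte Carlo (sequential spin flips): start from a
# noisy cue of archetype 0 and let the network relax
sigma = archetypes[0] * rng.choice([-1, 1], p=[0.15, 0.85], size=N)
for _ in range(20):                                # sweeps
    for i in rng.permutation(N):
        sigma[i] = 1 if J[i] @ sigma >= 0 else -1

# Overlap with the hidden archetype: a value near 1 means the
# "concept" emerged from the blurred examples
overlap = abs(archetypes[0] @ sigma) / N
print(f"overlap with archetype 0: {overlap:.2f}")
```

With enough examples per archetype the relaxed configuration aligns with the hidden archetype rather than with any single blurred copy; shrinking M or r below a critical level (the phenomenon the paper quantifies) makes this retrieval fail.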
Pages: 232-253
Page count: 22