Parallel learning by multitasking neural networks

Cited by: 3
Authors
Agliari, Elena [1]
Alessandrelli, Andrea [2, 5]
Barra, Adriano [3, 5]
Ricci-Tersenghi, Federico [4, 5, 6]
Affiliations
[1] Sapienza Univ Roma, Dipartimento Matemat, Piazzale Aldo Moro 5, I-00185 Rome, Italy
[2] Univ Pisa, Dipartimento Informat, Lungarno Antonio Pacinotti 43, I-56126 Pisa, Italy
[3] Univ Salento, Dipartimento Matemat & Fis, Via Arnesano, I-73100 Lecce, Italy
[4] Sapienza Univ Roma, Dipartimento Fis, Piazzale Aldo Moro 2, I-00185 Rome, Italy
[5] Ist Nazl Fis Nucl, Sez Roma1 & Lecce, Lecce, Italy
[6] CNR, Nanotec, Rome Unit, I-00185 Rome, Italy
Keywords
machine learning; computational neuroscience; optimization over networks; systems neuroscience; MODEL
DOI
10.1088/1742-5468/ad0a86
CLC number
O3 [Mechanics]
Discipline classification codes
08; 0801
Abstract
Parallel learning, namely the simultaneous learning of multiple patterns, constitutes a modern challenge for neural networks. While this cannot be accomplished by standard Hebbian associative neural networks, in this paper we show how the multitasking Hebbian network (a variation on the theme of the Hopfield model, working on sparse datasets) is naturally able to perform this complex task. We focus on systems processing in parallel a finite (up to logarithmic growth in the size of the network) number of patterns, mirroring the low-storage setting of standard associative neural networks. When patterns to be reconstructed are mildly diluted, the network handles them hierarchically, distributing the amplitudes of their signals as power laws w.r.t. the pattern information content (hierarchical regime), while, for strong dilution, the signals pertaining to all the patterns are simultaneously raised with the same strength (parallel regime). Further, we prove that the training protocol (either supervised or unsupervised) neither alters the multitasking performances nor changes the thresholds for learning. We also highlight (analytically and by Monte Carlo simulations) that a standard cost function (i.e. the Hamiltonian) used in statistical mechanics exhibits the same minima as a standard loss function (i.e. the sum of squared errors) used in machine learning.
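
As a rough illustration of the setting the abstract describes, the sketch below builds a Hebbian coupling matrix from diluted patterns, relaxes a random configuration at zero temperature, and reads off the signal (Mattis overlap) on each pattern. It is a minimal sketch under stated assumptions, not the authors' protocol: the patterns are stored directly rather than learned from examples, and the entry distribution (0 with probability d, otherwise ±1 with equal probability), the 1/N normalisation, the asynchronous update rule, and all function names are illustrative choices not taken from this record.

```python
import numpy as np


def sample_diluted_patterns(K, N, d, rng):
    """Draw K patterns of length N; each entry is 0 with prob. d, else +/-1 (assumed)."""
    signs = rng.choice([-1, 1], size=(K, N))
    keep = rng.random((K, N)) >= d  # True where the entry is not diluted away
    return signs * keep


def hebbian_couplings(xi):
    """Hebbian matrix J_ij = (1/N) * sum_mu xi_i^mu xi_j^mu with zero diagonal."""
    N = xi.shape[1]
    J = xi.T @ xi / N
    np.fill_diagonal(J, 0.0)
    return J


def energy(J, sigma):
    """Hopfield-type Hamiltonian H = -(1/2) * sum_{i != j} J_ij sigma_i sigma_j."""
    return -0.5 * sigma @ J @ sigma


def relax(J, sigma, sweeps=20):
    """Asynchronous zero-temperature dynamics: align each spin with its local field."""
    sigma = sigma.copy()
    for _ in range(sweeps):
        changed = False
        for i in range(len(sigma)):
            h = J[i] @ sigma  # local field; independent of sigma[i] since J_ii = 0
            if h != 0 and np.sign(h) != sigma[i]:
                sigma[i] = np.sign(h)
                changed = True
        if not changed:  # fixed point reached
            break
    return sigma


def mattis_overlaps(xi, sigma):
    """Signal on each pattern: m_mu = (1/N) * sum_i xi_i^mu sigma_i."""
    return xi @ sigma / xi.shape[1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xi = sample_diluted_patterns(K=3, N=200, d=0.5, rng=rng)
    J = hebbian_couplings(xi)
    start = rng.choice([-1, 1], size=200)
    final = relax(J, start)
    print("energy before/after:", energy(J, start), energy(J, final))
    print("overlaps:", mattis_overlaps(xi, final))
```

Because the couplings are symmetric with zero diagonal, each accepted spin flip cannot raise the energy, so the relaxation ends in a local minimum of the Hamiltonian. Varying the assumed dilution d and comparing the resulting overlaps is one way to explore how the signal is shared among patterns.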
Pages: 38