Connecting weighted automata, tensor networks and recurrent neural networks through spectral learning

Cited by: 1
Authors
Li, Tianyu [1 ]
Precup, Doina [2 ]
Rabusseau, Guillaume [3 ,4 ]
Affiliations
[1] McGill Univ, Mila, Montreal, PQ, Canada
[2] McGill Univ, CCAI Chair Mila, Montreal, PQ, Canada
[3] Univ Montreal, CCAI Chair Mila, Montreal, PQ, Canada
[4] Univ Montreal, DIRO, Montreal, PQ, Canada
Keywords
Weighted automata; Spectral learning; Recurrent neural networks; Tensor networks; Tensor train decomposition; Finite-state automata; Matrix recovery
DOI
10.1007/s10994-022-06164-1
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks, which encompass a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low-rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm that leverages this structure to scale up to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks (2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Leveraging this equivalence result combined with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed learning algorithm is assessed in a simulation study on both synthetic and real-world data.
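To make the classical spectral learning step referenced in the abstract concrete, the sketch below shows the standard recovery of WFA parameters from estimated Hankel blocks via a truncated SVD. This is a minimal, hedged illustration of the generic algorithm, not the authors' implementation; all names (spectral_wfa, n_states, wfa_value, the block layout of the inputs) are assumptions chosen for the example, and the estimation of the Hankel blocks themselves is presumed done.

import numpy as np

def spectral_wfa(H, H_sigma, h_prefix, h_suffix, n_states):
    """Recover WFA parameters from (estimated) Hankel blocks.

    H        : (P x S) Hankel block, H[u, v] = f(uv)
    H_sigma  : dict mapping each symbol s to the (P x S) block H_s[u, v] = f(u s v)
    h_prefix : length-P vector, h_prefix[u] = f(u)  (column of H for the empty suffix)
    h_suffix : length-S vector, h_suffix[v] = f(v)  (row of H for the empty prefix)
    n_states : target number of states (rank of the truncated SVD)
    """
    # Truncated SVD gives the rank factorization H ~ P S, with
    # prefix factor P = U diag(d) and suffix factor S = Vt.
    U, d, Vt = np.linalg.svd(H, full_matrices=False)
    U, d, Vt = U[:, :n_states], d[:n_states], Vt[:n_states, :]
    P_pinv = (U / d).T   # pseudo-inverse of the prefix factor: diag(1/d) @ U.T
    S_pinv = Vt.T        # pseudo-inverse of the suffix factor (orthonormal rows)
    # Transition matrix for each symbol: A_s = P^+ H_s S^+
    A = {s: P_pinv @ Hs @ S_pinv for s, Hs in H_sigma.items()}
    alpha0 = h_suffix @ S_pinv     # initial weight vector
    alpha_inf = P_pinv @ h_prefix  # final weight vector
    return alpha0, A, alpha_inf

def wfa_value(alpha0, A, alpha_inf, word):
    # Computed function: f(x1 ... xk) ~ alpha0^T A_{x1} ... A_{xk} alpha_inf
    state = alpha0
    for s in word:
        state = state @ A[s]
    return state @ alpha_inf

The paper's contribution builds on this baseline: the tensor-train view of the Hankel matrix lets the algorithm avoid forming the full matrix, and the same recovery idea, applied to low-rank sub-blocks of the Hankel tensor, yields the provable learning algorithm for linear 2-RNN over continuous inputs.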
Pages: 2619-2653
Page count: 35
Related Papers
50 records in total
  • [21] ARRNET: Action Recognition through Recurrent Neural Networks
    Krishnan, Kumaresh
    Prabhu, Nikita
    Babu, R. Venkatesh
    2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016,
  • [22] Residual Recurrent Neural Networks for Learning Sequential Representations
    Yue, Boxuan
    Fu, Junwei
    Liang, Jun
    INFORMATION, 2018, 9 (03)
  • [23] Semantic learning in autonomously active recurrent neural networks
    Gros, Claudius
    Kaczor, Gregor
    LOGIC JOURNAL OF THE IGPL, 2010, 18 (05) : 686 - 704
  • [24] On the Learning Capabilities of Recurrent Neural Networks: A Cryptographic Perspective
    Srivastava, Shivin
    Bhatia, Ashutosh
    2018 9TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK), 2018, : 162 - 167
  • [25] A conjugate gradient learning algorithm for recurrent neural networks
    Chang, WF
    Mak, MW
    NEUROCOMPUTING, 1999, 24 (1-3) : 173 - 189
  • [26] Effect of complexity on learning ability of recurrent neural networks
    Honma, N.
    Kitagawa, K.
    Abe, K.
    Artificial Life and Robotics, 1998, 2 (3) : 97 - 101
  • [27] Sequence Metric Learning as Synchronization of Recurrent Neural Networks
    Compagnon, Paul
    Lefebvre, Gregoire
    Duffner, Stefan
    Garcia, Christophe
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [28] Learning to Learn and Compositionality with Deep Recurrent Neural Networks
    de Freitas, Nando
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 3 - 3
  • [29] Continual learning for recurrent neural networks: An empirical evaluation
    Cossu, Andrea
    Carta, Antonio
    Lomonaco, Vincenzo
    Bacciu, Davide
    NEURAL NETWORKS, 2021, 143 : 607 - 627
  • [30] Learning Password Modification Patterns with Recurrent Neural Networks
    Nosenko, Alex
    Cheng, Yuan
    Chen, Haiquan
    SECURE KNOWLEDGE MANAGEMENT IN THE ARTIFICIAL INTELLIGENCE ERA, 2022, 1549 : 110 - 129