Connecting weighted automata, tensor networks and recurrent neural networks through spectral learning

Cited by: 1

Authors:

Li, Tianyu [1]

Precup, Doina [2]

Rabusseau, Guillaume [3,4]

Affiliations:

[1] McGill Univ, Mila, Montreal, PQ, Canada

[2] McGill Univ, CCAI Chair Mila, Montreal, PQ, Canada

[3] Univ Montreal, CCAI Chair Mila, Montreal, PQ, Canada

[4] Univ Montreal, DIRO, Montreal, PQ, Canada

Source:

MACHINE LEARNING | 2024, Vol. 113, Issue 5

Keywords:

Weighted automata; Spectral learning; Recurrent neural networks; Tensor networks; Tensor train decomposition; FINITE-STATE AUTOMATA; MATRIX RECOVERY;

DOI:

10.1007/s10994-022-06164-1

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks, which encompass a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low-rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm that leverages this structure to scale up to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks (2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Leveraging this equivalence result combined with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed learning algorithm is assessed in a simulation study on both synthetic and real-world data.
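
The equivalence stated in the abstract between WFA and linear 2-RNN on discrete symbols can be illustrated with a small numerical check. The sketch below is not taken from the paper: it assumes the standard definitions (a WFA computes alpha^T A[x1] ... A[xk] omega, and a linear 2-RNN updates its hidden state bilinearly in the previous state and the current input), and all names (`wfa_value`, `linear_2rnn_value`, `one_hot`, the tensor `T`) are hypothetical. Stacking the WFA transition matrices into a third-order tensor and feeding one-hot inputs should yield the same value from both models.

```python
import random


def wfa_value(alpha, A, omega, word):
    """WFA with n states: f(x1..xk) = alpha^T A[x1] ... A[xk] omega."""
    state = list(alpha)
    n = len(state)
    for sym in word:
        # Row vector times the transition matrix of the current symbol.
        state = [sum(state[i] * A[sym][i][j] for i in range(n)) for j in range(n)]
    return sum(state[j] * omega[j] for j in range(n))


def linear_2rnn_value(h0, T, omega, inputs):
    """Linear 2-RNN: h_t[j] = sum_{i,s} T[i][s][j] * h_{t-1}[i] * x_t[s]."""
    h = list(h0)
    n = len(h)
    for x in inputs:
        d = len(x)
        h = [
            sum(T[i][s][j] * h[i] * x[s] for i in range(n) for s in range(d))
            for j in range(n)
        ]
    return sum(h[j] * omega[j] for j in range(n))


def one_hot(sym, d):
    """Encode a discrete symbol as a d-dimensional indicator vector."""
    return [1.0 if s == sym else 0.0 for s in range(d)]


random.seed(0)
n, d = 3, 2  # assumed sizes: 3 states, alphabet of 2 symbols

alpha = [random.uniform(-1, 1) for _ in range(n)]
omega = [random.uniform(-1, 1) for _ in range(n)]
# A[s] is the n x n transition matrix of symbol s.
A = [[[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)] for _ in range(d)]

# Stack the transition matrices into a tensor indexed (state, symbol, state).
T = [[[A[s][i][j] for j in range(n)] for s in range(d)] for i in range(n)]

word = [0, 1, 1, 0]
wfa_out = wfa_value(alpha, A, omega, word)
rnn_out = linear_2rnn_value(alpha, T, omega, [one_hot(s, d) for s in word])
print(abs(wfa_out - rnn_out) < 1e-9)
```

Because the 2-RNN accepts arbitrary real-valued input vectors rather than only one-hot encodings, the same tensor `T` also defines a function over sequences of continuous vectors, which is the setting the abstract's learning algorithm targets.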


Pages: 2619-2653

Page count: 35
