Fast training of deep LSTM networks with guaranteed stability for nonlinear system modeling

Cited by: 32
Authors
Yu, Wen [1]
Gonzalez, Jesus [1]
Li, Xiaoou [2]
Affiliations
[1] Natl Polytech Inst, IPN, Dept Control Automat CINVESTAV, Mexico City, DF, Mexico
[2] Natl Polytech Inst, IPN, Dept Comp CINVESTAV, Mexico City, DF, Mexico
Keywords
LSTM; Training; BPTT; Stability;
DOI
10.1016/j.neucom.2020.09.030
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep recurrent neural networks (RNNs), such as LSTM, have many advantages over feedforward networks for nonlinear system modeling. However, the most widely used training method, backpropagation through time (BPTT), is very slow. In this paper, by separating the LSTM cell into forward and recurrent models, we give a faster training method than BPTT. The deep LSTM is modified by combining the deep RNN with multilayer perceptrons (MLP). Backpropagation-like training methods are proposed for the deep RNN and the MLP. The stability of these algorithms is demonstrated. The simulation results show that our fast training methods for LSTM are better than the conventional approaches. (c) 2020 Elsevier B.V. All rights reserved.
Pages: 85-94
Number of pages: 10