Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited by: 2278
Author
Sherstinsky, Alex
Affiliation
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; Backpropagation
DOI
10.1016/j.physd.2019.132306
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasizes ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. © 2019 Elsevier B.V. All rights reserved.
Pages: 28
Related papers
50 in total
  • [21] Design of a Soft Sensor Based on Long Short-Term Memory Artificial Neural Network (LSTM) for Wastewater Treatment Plants
    Recio-Colmenares, Roxana
    Becerril, Elizabeth Leon
    Tun, Kelly Joel Gurubel
    Conchas, Robin F.
    SENSORS, 2023, 23 (22)
  • [22] Exploring Temporal Dynamics of River Discharge Using Univariate Long Short-Term Memory (LSTM) Recurrent Neural Network at East Branch of Delaware River
    Mehedi, Md Abdullah Al
    Khosravi, Marzieh
    Yazdan, Munshi Md Shafwat
    Shabanian, Hanieh
    HYDROLOGY, 2022, 9 (11)
  • [23] Modeling of Scroll Expander Based on Long Short-Term Memory Neural Network
    Zheng, Tianyou
    Li, Ke
    Ma, Xin
    Qu, Chao
    Zhang, Chenghui
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 732 - 736
  • [24] Convolutional long short-term memory neural network for groundwater change prediction
    Patra, Sumriti Ranjan
    Chu, Hone-Jay
    FRONTIERS IN WATER, 2024, 6
  • [25] T-LSTM: A Long Short-Term Memory Neural Network Enhanced by Temporal Information for Traffic Flow Prediction
    Mou, Luntian
    Zhao, Pengfei
    Xie, Haitao
    Chen, Yanyan
    IEEE ACCESS, 2019, 7 : 98053 - 98060
  • [26] Recurrent neural network and long short-term memory models for audio copy-move forgery detection: a comprehensive study
    Akdeniz, Fulya
    Becerikli, Yasar
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (12) : 17575 - 17605
  • [27] A hybrid convolutional neural network with long short-term memory for statistical arbitrage
    Eggebrecht, P.
    Luetkebohmert, E.
    QUANTITATIVE FINANCE, 2023, 23 (04) : 595 - 613
  • [28] Classification of multiple cattle behavior patterns using a recurrent neural network with long short-term memory and inertial measurement units
    Peng, Yingqi
    Kondo, Naoshi
    Fujiura, Tateshi
    Suzuki, Tetsuhito
    Wulandari
    Yoshioka, Hidetsugu
    Itoyama, Erina
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2019, 157 : 247 - 253
  • [29] Driving Intention Identification Based on Long Short-Term Memory Neural Network
    Liu, Yonggang
    Zhao, Pan
    Qin, Datong
    Yang, Yang
    Chen, Zheng
    2019 IEEE VEHICLE POWER AND PROPULSION CONFERENCE (VPPC), 2019,
  • [30] Evolving long short-term memory neural network for wind speed forecasting
    Huang, Cong
    Karimi, Hamid Reza
    Mei, Peng
    Yang, Daoguang
    Shi, Quan
    INFORMATION SCIENCES, 2023, 632 : 390 - 410