Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited by: 2278
Author
Sherstinsky, Alex
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; Backpropagation
DOI
10.1016/j.physd.2019.132306
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasize ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 28
Related papers
50 in total
  • [41] A long short-term memory-fully connected (LSTM-FC) neural network for predicting the incidence of bronchopneumonia in children
    Zhao, Dongzhe
    Chen, Min
    Shi, Kaifang
    Ma, Mingguo
    Huang, Yang
    Shen, Jingwei
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2021, 28 (40) : 56892 - 56905
  • [42] Human activity classification using long short-term memory network
    Welhenge, Anuradhi Malshika
    Taparugssanagorn, Attaphongse
    SIGNAL IMAGE AND VIDEO PROCESSING, 2019, 13 (04) : 651 - 656
  • [43] Human activity classification using long short-term memory network
    Anuradhi Malshika Welhenge
    Attaphongse Taparugssanagorn
    Signal, Image and Video Processing, 2019, 13 : 651 - 656
  • [44] A long short-term memory-fully connected (LSTM-FC) neural network for predicting the incidence of bronchopneumonia in children
    Dongzhe Zhao
    Min Chen
    Kaifang Shi
    Mingguo Ma
    Yang Huang
    Jingwei Shen
    Environmental Science and Pollution Research, 2021, 28 : 56892 - 56905
  • [45] A Multivariate Long Short-Term Memory Neural Network for Coalbed Methane Production Forecasting
    Xu, Xijie
    Rui, Xiaoping
    Fan, Yonglei
    Yu, Tian
    Ju, Yiwen
    SYMMETRY-BASEL, 2020, 12 (12) : 1 - 15
  • [46] Lossless Image Compression Algorithm Based on Long Short-term Memory Neural Network
    Zhu, Caixin
    Zhang, Huaiyao
    Tang, Yun
    2020 5TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND APPLICATIONS (ICCIA 2020), 2020 : 82 - 88
  • [47] Instruction SDC Vulnerability Prediction Using Long Short-Term Memory Neural Network
    Liu, Yunfei
    Li, Jing
    Zhuang, Yi
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2018, 2018, 11323 : 140 - 149
  • [48] Image Description Generator using Residual Neural Network and Long Short-Term Memory
    Morampudi, Mahesh Kumar
    Gonthina, Nagamani
    Bhaskar, Nuthanakanti
    Reddy, V. Dinesh
    COMPUTER SCIENCE JOURNAL OF MOLDOVA, 2023, 31 (01) : 3 - 21
  • [49] Driver Profiling Using Long Short Term Memory (LSTM) and Convolutional Neural Network (CNN) Methods
    Cura, Aslihan
    Kucuk, Haluk
    Ergen, Erdem
    Oksuzoglu, Ismail Burak
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2021, 22 (10) : 6572 - 6582
  • [50] Research on Predictive Maintenance of Aircraft Based on Long Short-Term Memory Neural Network
    Lee, Chin-Hsiung
    Lee, Chih-Yu
    2022 ASIA CONFERENCE ON ADVANCED ROBOTICS, AUTOMATION, AND CONTROL ENGINEERING (ARACE 2022), 2022 : 150 - 154