Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited by: 2278
Author
Sherstinsky, Alex
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; Backpropagation
DOI
10.1016/j.physd.2019.132306
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasizes ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. © 2019 Elsevier B.V. All rights reserved.
Pages: 28
Related papers
50 in total
  • [31] Use of Recurrent Neural Network with Long Short-Term Memory for Seepage Prediction at Tarbela Dam, KP, Pakistan
    Ishfaque, Muhammad
    Dai, Qianwei
    ul Haq, Nuhman
    Jadoon, Khanzaib
    Shahzad, Syed Muzyan
    Janjuhah, Hammad Tariq
    ENERGIES, 2022, 15 (09)
  • [32] Short-term wind speed forecasting based on long short-term memory and improved BP neural network
    Chen, Gonggui
    Tang, Bangrui
    Zeng, Xianjun
    Zhou, Ping
    Kang, Peng
    Long, Hongyu
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2022, 134
  • [33] Simulation of Open Quantum Dynamics with Bootstrap-Based Long Short-Term Memory Recurrent Neural Network
    Lin, Kunni
    Peng, Jiawei
    Gu, Feng Long
    Lan, Zhenggang
    JOURNAL OF PHYSICAL CHEMISTRY LETTERS, 2021, 12 (41): 10225-10234
  • [34] Long Short-Term Memory Spatial Transformer Network
    Feng, Shiyang
    Chen, Tianyue
    Sun, Hao
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019: 239-242
  • [35] A hybrid model for heart disease prediction using recurrent neural network and long short term memory
    Bhavekar G.S.
    Goswami A.D.
    International Journal of Information Technology, 2022, 14 (4): 1781-1789
  • [36] Music generation with long short-term memory network
    Yang, Junye
    SECOND IYSF ACADEMIC SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND COMPUTER ENGINEERING, 2021, 12079
  • [37] Short-Term Photovoltaic Power Forecasting Using an LSTM Neural Network
    Hossain, Mohammad Safayet
    Mahmood, Hisham
    2020 IEEE POWER & ENERGY SOCIETY INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE (ISGT), 2020
  • [38] Recurrent neural network (RNN) and long short-term memory neural network (LSTM) based data-driven methods for identifying cohesive zone law parameters of nickel-modified carbon nanotube reinforced sintered nano-silver adhesives
    Dai, Yanwei
    Wei, Jiahui
    Qin, Fei
    MATERIALS TODAY COMMUNICATIONS, 2024, 39
  • [39] Remaining Useful Life Prediction Method Based on Convolutional Neural Network and Long Short-Term Memory Neural Network
    Zhao, Kaisheng
    Zhang, Jing
    Chen, Shaowei
    Wen, Pengfei
    Ping, Wang
    Zhao, Shuai
    2023 PROGNOSTICS AND HEALTH MANAGEMENT CONFERENCE, PHM, 2023: 336-343
  • [40] COMPARATIVE STUDY OF CONVOLUTIONAL NEURAL NETWORK AND LONG SHORT-TERM MEMORY NETWORK FOR SOLAR IRRADIANCE FORECASTING
    Behera, Sasmita
    Bhoi, Sapnil S.
    Mishra, Asutosh
    Nayak, Silon S.
    Panda, Subrat K.
    Patnaik, Soumik S.
    JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2022, 17 (03): 1845-1856