Segmented-Memory Recurrent Neural Networks

Cited by: 16
Authors
Chen, Jinmiao [1]
Chaudhari, Narendra S. [1]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2009, Vol. 20, No. 8
Keywords
Gradient descent; information latching; long-term dependencies; recurrent neural networks (RNNs); segmented memory; vanishing gradient; PROTEIN SECONDARY STRUCTURE; LONG-TERM DEPENDENCIES; STRUCTURE PREDICTION; GRADIENT-DESCENT; REHEARSAL; ALGORITHM; STATE;
DOI
10.1109/TNN.2009.2022980
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Conventional recurrent neural networks (RNNs) have difficulty learning long-term dependencies. To tackle this problem, we propose an architecture called the segmented-memory recurrent neural network (SMRNN). A symbolic sequence is broken into segments and then presented to the SMRNN one symbol per cycle. The SMRNN uses separate internal states to store symbol-level context and segment-level context: the symbol-level context is updated after each symbol is presented, while the segment-level context is updated after each segment. The SMRNN is trained using an extended real-time recurrent learning algorithm. We test the performance of the SMRNN on the information latching problem, the "two-sequence problem," and the problem of protein secondary structure (PSS) prediction. Our implementation results indicate that the SMRNN performs better on long-term dependency problems than conventional RNNs. In addition, we theoretically analyze how the segmented memory of the SMRNN helps in learning long-term temporal dependencies, and we study the impact of the segment length.
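The two-level update scheme described in the abstract can be sketched as follows. This is a minimal illustrative forward pass, not the paper's exact formulation: the weight names (`W_xx`, `W_xu`, `W_yy`, `W_yx`) and the Elman-style tanh updates are assumptions for illustration; the paper's own update equations and the extended real-time recurrent learning training rule are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def smrnn_forward(sequence, d, n_hidden=8):
    """Sketch of an SMRNN forward pass over `sequence` with segment length d.

    x : symbol-level context, updated at every symbol
    y : segment-level context, updated only at segment boundaries
    """
    n_in = sequence[0].shape[0]
    # Randomly initialized weights (hypothetical names, not from the paper).
    W_xx = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    W_xu = rng.normal(scale=0.1, size=(n_hidden, n_in))
    W_yy = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    W_yx = rng.normal(scale=0.1, size=(n_hidden, n_hidden))

    x = np.zeros(n_hidden)  # symbol-level state
    y = np.zeros(n_hidden)  # segment-level state
    for t, u in enumerate(sequence, start=1):
        # Symbol-level context: updated for each symbol presented as input.
        x = np.tanh(W_xx @ x + W_xu @ u)
        # Segment-level context: updated only after each segment of d symbols
        # (and at the end of the sequence), so gradients at the segment level
        # traverse far fewer steps than the raw sequence length.
        if t % d == 0 or t == len(sequence):
            y = np.tanh(W_yy @ y + W_yx @ x)
    return y

seq = [rng.normal(size=4) for _ in range(12)]
out = smrnn_forward(seq, d=3)
print(out.shape)  # (8,)
```

The point of the segmentation is visible in the loop: for a sequence of length T, the segment-level state `y` is updated only about T/d times, shortening the path over which gradients must propagate.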
Pages: 1267-1280 (14 pages)