Segmented-Memory Recurrent Neural Networks

Cited by: 17
Authors
Chen, Jinmiao [1 ]
Chaudhari, Narendra S. [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2009, Vol. 20, No. 8
Keywords
Gradient descent; information latching; long-term dependencies; recurrent neural networks (RNNs); segmented memory; vanishing gradient; PROTEIN SECONDARY STRUCTURE; LONG-TERM DEPENDENCIES; STRUCTURE PREDICTION; GRADIENT-DESCENT; REHEARSAL; ALGORITHM; STATE;
DOI
10.1109/TNN.2009.2022980
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Conventional recurrent neural networks (RNNs) have difficulty learning long-term dependencies. To tackle this problem, we propose an architecture called the segmented-memory recurrent neural network (SMRNN). A symbolic sequence is broken into segments and then presented to the SMRNN one symbol per cycle. The SMRNN uses separate internal states to store symbol-level context and segment-level context: the symbol-level context is updated for each symbol presented as input, while the segment-level context is updated only after each segment. The SMRNN is trained with an extended real-time recurrent learning algorithm. We test the performance of the SMRNN on the information latching problem, the "two-sequence problem," and protein secondary structure (PSS) prediction. Our results indicate that the SMRNN outperforms conventional RNNs on long-term dependency problems. We also analyze theoretically how the segmented memory of the SMRNN helps in learning long-term temporal dependencies, and study the impact of the segment length.
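To make the two-timescale update concrete, below is a minimal NumPy sketch of one SMRNN forward pass, written from the abstract alone. The weight names (Wxu, Wxx, Wyx, Wyy, Wzy), the tanh nonlinearity, and the reset of the symbol-level context at segment boundaries are illustrative assumptions, not the paper's exact equations; the extended real-time recurrent learning update used for training is omitted.

```python
import numpy as np

def smrnn_forward(seq, d, params, f=np.tanh):
    """Sketch of a segmented-memory forward pass (illustrative only).

    seq    : array of shape (T, n_in), one symbol vector per row
    d      : segment length, i.e. number of symbols per segment
    params : dict of weight matrices; the names below are hypothetical
    """
    Wxu, Wxx = params["Wxu"], params["Wxx"]  # symbol-level weights (assumed names)
    Wyx, Wyy = params["Wyx"], params["Wyy"]  # segment-level weights (assumed names)
    Wzy = params["Wzy"]                      # readout weights (assumed name)
    x = np.zeros(Wxx.shape[0])               # symbol-level context
    y = np.zeros(Wyy.shape[0])               # segment-level context
    for t, u in enumerate(seq, start=1):
        x = f(Wxx @ x + Wxu @ u)             # updated at every symbol
        if t % d == 0 or t == len(seq):      # segment boundary reached
            y = f(Wyy @ y + Wyx @ x)         # updated once per segment
            x = np.zeros_like(x)             # assumed reset of symbol context
    return f(Wzy @ y)                        # output from segment-level context

# Usage with random weights of compatible shapes:
rng = np.random.default_rng(0)
shapes = {"Wxu": (8, 4), "Wxx": (8, 8), "Wyx": (6, 8), "Wyy": (6, 6), "Wzy": (2, 6)}
params = {k: 0.1 * rng.standard_normal(s) for k, s in shapes.items()}
out = smrnn_forward(rng.standard_normal((20, 4)), d=5, params=params)
```

Because the segment-level context receives only one update per segment, the number of recurrent steps the gradient must traverse at that level shrinks by roughly a factor of the segment length d, which is the intuition behind the architecture's improved handling of long-term dependencies.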
Pages: 1267-1280
Page count: 14