Segmented-Memory Recurrent Neural Networks

Cited by: 17
Authors
Chen, Jinmiao [1 ]
Chaudhari, Narendra S. [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2009, Vol. 20, No. 8
Keywords
Gradient descent; information latching; long-term dependencies; recurrent neural networks (RNNs); segmented memory; vanishing gradient; PROTEIN SECONDARY STRUCTURE; LONG-TERM DEPENDENCIES; STRUCTURE PREDICTION; GRADIENT-DESCENT; REHEARSAL; ALGORITHM; STATE;
DOI
10.1109/TNN.2009.2022980
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Conventional recurrent neural networks (RNNs) have difficulty learning long-term dependencies. To tackle this problem, we propose an architecture called the segmented-memory recurrent neural network (SMRNN). A symbolic sequence is broken into segments and then presented to the SMRNN one symbol per cycle. The SMRNN uses separate internal states to store symbol-level context and segment-level context: the symbol-level context is updated for each symbol presented as input, while the segment-level context is updated only after each segment. The SMRNN is trained with an extended real-time recurrent learning algorithm. We test the performance of the SMRNN on the information latching problem, the "two-sequence problem," and protein secondary structure (PSS) prediction. Our results indicate that the SMRNN outperforms conventional RNNs on long-term dependency problems. We also analyze theoretically how the segmented memory of the SMRNN helps in learning long-term temporal dependencies, and study the impact of the segment length.
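To make the two-timescale update concrete, below is a minimal NumPy sketch of one SMRNN forward pass, written from the abstract alone. The weight names (Wxu, Wxx, Wyx, Wyy, Wzy), the tanh nonlinearity, and the reset of the symbol-level context at segment boundaries are illustrative assumptions, not the paper's exact equations; the extended real-time recurrent learning update used for training is omitted.

```python
import numpy as np

def smrnn_forward(seq, d, params, f=np.tanh):
    """Sketch of a segmented-memory forward pass (illustrative only).

    seq    : array of shape (T, n_in), one symbol vector per row
    d      : segment length, i.e. number of symbols per segment
    params : dict of weight matrices; the names below are hypothetical
    """
    Wxu, Wxx = params["Wxu"], params["Wxx"]  # symbol-level weights (assumed names)
    Wyx, Wyy = params["Wyx"], params["Wyy"]  # segment-level weights (assumed names)
    Wzy = params["Wzy"]                      # readout weights (assumed name)
    x = np.zeros(Wxx.shape[0])               # symbol-level context
    y = np.zeros(Wyy.shape[0])               # segment-level context
    for t, u in enumerate(seq, start=1):
        x = f(Wxx @ x + Wxu @ u)             # updated at every symbol
        if t % d == 0 or t == len(seq):      # segment boundary reached
            y = f(Wyy @ y + Wyx @ x)         # updated once per segment
            x = np.zeros_like(x)             # assumed reset of symbol context
    return f(Wzy @ y)                        # output from segment-level context

# Usage with random weights of compatible shapes:
rng = np.random.default_rng(0)
shapes = {"Wxu": (8, 4), "Wxx": (8, 8), "Wyx": (6, 8), "Wyy": (6, 6), "Wzy": (2, 6)}
params = {k: 0.1 * rng.standard_normal(s) for k, s in shapes.items()}
out = smrnn_forward(rng.standard_normal((20, 4)), d=5, params=params)
```

Because the segment-level context receives only one update per segment, the number of recurrent steps the gradient must traverse at that level shrinks by roughly a factor of the segment length d, which is the intuition behind the architecture's improved handling of long-term dependencies.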
Pages: 1267-1280
Page count: 14