Spontaneous Temporal Grouping Neural Network for Long-Term Memory Modeling

Cited by: 2
Authors
Shan, Dongjing [1 ]
Zhang, Xiongwei [1 ]
Zhang, Chao [2 ]
Affiliations
[1] Army Engineering University, Laboratory of Intelligent Information Processing, Nanjing 210007, People's Republic of China
[2] Peking University, Key Laboratory of Machine Perception (MOE), Beijing 100871, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Computer architecture; Logic gates; Microprocessors; Training; Standards; Data models; Task analysis; Long-term memory; recurrent neural network; temporal dependency; temporal grouping; vanishing gradient; RECALL
DOI
10.1109/TCDS.2021.3050759
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The capacity of long-term memory is an important issue in sequence learning, but it remains challenging because of vanishing gradients and out-of-order dependencies. Inspired by human memory, in which long-term memories are broken into fragments and then recalled at appropriate times, we propose a neural network based on spontaneous temporal grouping in this article. In the architecture, a segmented layer performs spontaneous sequence segmentation under the guidance of reset gates, which are driven to be sparse during training; a cascading layer collects information from the temporal groups, where a filtered long short-term memory (LSTM) with chrono-initialization is proposed to alleviate gradient vanishing, and random skip connections are adopted to capture complex dependencies among the groups. Furthermore, the advantage of our neural architecture for long-term memory is demonstrated via a new measurement method. In experiments, we compare its performance against multiple models on several algorithmic and classification tasks, using both fixed-length sequences, such as the MNIST variants, and variable-length sequences, such as speech utterances. The results under different criteria demonstrate the superiority of the proposed neural network.
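The abstract names two concrete mechanisms: chrono-initialization of the filtered LSTM and reset gates driven to sparsity during training. Below is a minimal PyTorch sketch of both, assuming the standard chrono-initialization of Tallec and Ollivier (forget-gate bias drawn as log U[1, T_max - 1], input-gate bias its negative) and an assumed L1 penalty for gate sparsity; the paper's exact filtered-LSTM variant and regularizer are not specified in this record, so chrono_init and reset_gate_sparsity are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn


def chrono_init(lstm: nn.LSTM, t_max: int) -> None:
    """Chrono-initialization (Tallec & Ollivier, 2018): bias the LSTM
    toward remembering over horizons of up to t_max steps by setting
    b_f ~ log(U[1, t_max - 1]) and b_i = -b_f."""
    h = lstm.hidden_size
    with torch.no_grad():
        for layer in range(lstm.num_layers):
            b_ih = getattr(lstm, f"bias_ih_l{layer}")
            b_hh = getattr(lstm, f"bias_hh_l{layer}")
            # PyTorch sums bias_ih and bias_hh, so zero the input- and
            # forget-gate chunks of one vector before setting the other.
            # Gate order in PyTorch LSTM parameters is [i, f, g, o].
            b_hh[: 2 * h].zero_()
            b_ih[h : 2 * h] = torch.log(torch.rand(h) * (t_max - 2) + 1)
            b_ih[:h] = -b_ih[h : 2 * h]


def reset_gate_sparsity(reset_gates: torch.Tensor,
                        lam: float = 1e-3) -> torch.Tensor:
    # Hypothetical auxiliary loss: an L1 penalty pushing reset-gate
    # activations toward zero, so segment boundaries emerge only where
    # the data demand them. The abstract says the gates are "driven to
    # be sparse" but does not name the regularizer; L1 is assumed here.
    return lam * reset_gates.abs().mean()


# Usage: t_max should roughly match the longest dependency horizon,
# e.g., 784 for pixel-by-pixel MNIST sequences.
lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=1)
chrono_init(lstm, t_max=784)
```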
Pages: 472-484
Page count: 13