DRRNets: Dynamic Recurrent Routing via Low-Rank Regularization in Recurrent Neural Networks

Cited by: 12
Authors
Shan, Dongjing [1 ]
Luo, Yong [2 ,3 ]
Zhang, Xiongwei [4 ]
Zhang, Chao [5 ]
Affiliations
[1] Army Engn Univ, Lab Intelligent Informat Proc, Chongqing 400035, Peoples R China
[2] Wuhan Univ, Inst Artificial Intelligence, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan 430072, Peoples R China
[3] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan 430072, Peoples R China
[4] Army Engn Univ, Speech Proc Lab, Nanjing 210007, Peoples R China
[5] Peking Univ, Key Lab Machine Percept MOE, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Routing; Recurrent neural networks; Memory management; Task analysis; Logic gates; Computational modeling; Training; Long-term memory; low rank; recurrent neural network (RNN); sparsity projection; temporal dependency; vanishing gradients; SHORT-TERM-MEMORY;
DOI
10.1109/TNNLS.2021.3105818
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recurrent neural networks (RNNs) continue to show outstanding performance in sequence learning tasks such as language modeling, but training RNNs on long sequences remains difficult. The main challenges are complex long-range dependencies, vanishing or exploding gradients, and the low-resource requirements of model deployment. To address these challenges, we propose dynamic recurrent routing neural networks (DRRNets), which can: 1) shorten the recurrent paths by dynamically allocating recurrent routes for different dependencies and 2) significantly reduce the number of parameters by imposing low-rank constraints on the fully connected layers. A novel optimization algorithm based on low-rank constraints and sparsity projection is developed to train the network. We verify the effectiveness of the proposed method by comparing it with multiple competitive approaches on several popular sequence learning tasks, such as language modeling and speaker recognition. The results under different criteria demonstrate the superiority of the proposed method.
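The abstract names two mechanisms: dynamic route allocation and low-rank constraints on the fully connected (recurrent) weights, trained with a sparsity-projection step. As a rough illustration of the second idea only, below is a minimal PyTorch sketch of a rank-r factorized RNN cell plus a generic hard-threshold sparsity projection. The names LowRankRNNCell and sparsity_projection are hypothetical, and the top-k thresholding is a standard stand-in, not the paper's exact operator.

```python
import torch
import torch.nn as nn

class LowRankRNNCell(nn.Module):
    """Vanilla RNN cell whose hidden-to-hidden matrix is factorized as
    U @ V with rank r << hidden_size, cutting its parameter count from
    hidden_size**2 to 2 * hidden_size * r (illustrative sketch only)."""

    def __init__(self, input_size: int, hidden_size: int, rank: int):
        super().__init__()
        self.W_in = nn.Linear(input_size, hidden_size)
        # Low-rank factors standing in for a dense recurrent matrix.
        self.U = nn.Parameter(0.1 * torch.randn(hidden_size, rank))
        self.V = nn.Parameter(0.1 * torch.randn(rank, hidden_size))

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # h @ V.T @ U.T equals h @ (U @ V).T but costs O(h*r), not O(h^2).
        return torch.tanh(self.W_in(x) + h @ self.V.t() @ self.U.t())

def sparsity_projection(param: torch.Tensor, keep_ratio: float = 0.1) -> None:
    """Project a parameter onto tensors with at most keep_ratio nonzero
    entries by hard-thresholding at the k-th largest magnitude. A generic
    stand-in for the paper's sparsity projection, not its exact operator."""
    with torch.no_grad():
        n = param.numel()
        k = max(1, int(keep_ratio * n))
        # The k-th largest magnitude is the (n - k + 1)-th smallest.
        threshold = param.abs().flatten().kthvalue(n - k + 1).values
        param.mul_((param.abs() >= threshold).to(param.dtype))

# Example: run a short sequence, then apply the projection after an update.
cell = LowRankRNNCell(input_size=32, hidden_size=128, rank=8)
h = torch.zeros(4, 128)                  # batch of 4 hidden states
for x in torch.randn(10, 4, 32):         # sequence of length 10
    h = cell(x, h)
sparsity_projection(cell.U)
```

In a full training loop, such a projection would presumably be interleaved with gradient updates; the sketch only shows the shape of the computation, not the paper's routing mechanism or optimization schedule.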
Pages: 2057-2067
Page count: 11