MEMORY VISUALIZATION FOR GATED RECURRENT NEURAL NETWORKS IN SPEECH RECOGNITION

Citations: 0
Authors
Tang, Zhiyuan [1 ,3 ]
Shi, Ying [1 ]
Wang, Dong [1 ,2 ]
Feng, Yang [1 ]
Zhang, Shiyue [1 ]
Affiliations
[1] Tsinghua Univ, RUT, CSLT, Beijing, Peoples R China
[2] Tsinghua Natl Lab Informat Sci & Technol, Beijing, Peoples R China
[3] Chinese Acad Sci, Chengdu Inst Comp Applicat, Beijing, Peoples R China
Source
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2017
Funding
National Natural Science Foundation of China
Keywords
long short-term memory; gated recurrent unit; visualization; residual learning; speech recognition;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly those with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU). However, the dynamic properties behind this remarkable performance remain unclear in many applications, e.g., automatic speech recognition (ASR). This paper employs visualization techniques to study the behavior of LSTM and GRU when performing speech recognition tasks. Our experiments show some interesting patterns in the gated memory, and some of them have inspired simple yet effective modifications to the network structure. We report two such modifications: (1) lazy cell update in LSTM, and (2) shortcut connections for residual learning. Both modifications lead to more comprehensible and powerful networks.
Pages: 2736-2740
Number of pages: 5