Densely Connected Networks for Conversational Speech Recognition

Cited by: 5
Authors
Han, Kyu J. [1]
Chandrashekaran, Akshay [1]
Kim, Jungsuk [1]
Lane, Ian [1]
Affiliations
[1] Capio Inc, Belmont, CA 94002 USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
Densely connected LSTM; Switchboard; conversational speech recognition
DOI
10.21437/Interspeech.2018-1486
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we show how we achieved state-of-the-art performance on the industry-standard NIST 2000 Hub5 English evaluation set. We propose densely connected LSTMs (dense LSTMs), inspired by the densely connected convolutional neural networks recently introduced for image classification tasks. We show that the proposed dense LSTMs provide more reliable performance than conventional residual LSTMs as more LSTM layers are stacked in a network. With RNN-LM rescoring and lattice combination on 5 systems (including 2 dense LSTM based systems) trained across three different phone sets, Capio's conversational speech recognition system obtained word error rates of 5.0% and 9.1% on Switchboard and CallHome, respectively.
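The contrast between residual and dense connectivity can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that dense connectivity means concatenating the block input with the outputs of all earlier layers, as in densely connected convolutional networks. The names `residual_stack`, `dense_stack`, and `toy_layer` are hypothetical, and a toy function stands in for a real LSTM layer.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical stand-in for one LSTM layer applied at a single time step.
Layer = Callable[[Vector], Vector]


def residual_stack(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Residual connectivity: each layer's output is added to its input."""
    h = list(x)
    for layer in layers:
        out = layer(h)
        # Element-wise skip connection; input and output widths must match.
        h = [a + b for a, b in zip(h, out)]
    return h


def dense_stack(x: Vector, layers: Sequence[Layer]) -> Vector:
    """Dense connectivity (assumed): each layer receives the concatenation
    of the block input and the outputs of all earlier layers."""
    features = [list(x)]
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    # The block output is the concatenation of everything computed so far.
    return [v for f in features for v in f]


def toy_layer(width: int) -> Layer:
    """Toy layer: returns `width` copies of the mean of its input."""
    def apply(inp: Vector) -> Vector:
        mean = sum(inp) / len(inp)
        return [mean] * width
    return apply
```

With two toy layers of width 2, `dense_stack([1.0, 3.0], [toy_layer(2), toy_layer(2)])` returns `[1.0, 3.0, 2.0, 2.0, 2.0, 2.0]`: the feature vector grows with every layer, so later layers keep direct access to earlier outputs. `residual_stack` on the same input returns `[7.0, 9.0]`, keeping the width fixed and summing instead.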
Pages: 796-800
Page count: 5