Sequence Discriminative Distributed Training of Long Short-Term Memory Recurrent Neural Networks

Cited by: 0
Authors
Sak, Hasim [1]
Vinyals, Oriol [1]
Heigold, Georg [1]
Senior, Andrew [1]
McDermott, Erik [1]
Monga, Rajat [1]
Mao, Mark [1]
Affiliation
[1] Google, Mountain View, CA 94043 USA
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
recurrent neural network; long short-term memory; sequence discriminative training; acoustic modeling
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We recently showed that Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform state-of-the-art deep neural networks (DNNs) for large-scale acoustic modeling where the models were trained with the cross-entropy (CE) criterion. It has also been shown that sequence discriminative training of DNNs initially trained with the CE criterion gives significant improvements. In this paper, we investigate sequence discriminative training of LSTM RNNs in a large-scale acoustic modeling task. We train the models in a distributed manner using the asynchronous stochastic gradient descent optimization technique. We compare two sequence discriminative criteria, maximum mutual information and state-level minimum Bayes risk, and we investigate a number of variations of the basic training strategy to better understand issues raised by both the sequential model and the objective function. We obtain significant gains over the CE-trained LSTM RNN model using sequence discriminative training techniques.
Pages: 1209-1213
Page count: 5