Sequence Discriminative Distributed Training of Long Short-Term Memory Recurrent Neural Networks

Cited by: 0
Authors
Sak, Hasim [1]
Vinyals, Oriol [1]
Heigold, Georg [1]
Senior, Andrew [1]
McDermott, Erik [1]
Monga, Rajat [1]
Mao, Mark [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
recurrent neural network; long short-term memory; sequence discriminative training; acoustic modeling
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We recently showed that Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform state-of-the-art deep neural networks (DNNs) for large scale acoustic modeling where the models were trained with the cross-entropy (CE) criterion. It has also been shown that sequence discriminative training of DNNs initially trained with the CE criterion gives significant improvements. In this paper, we investigate sequence discriminative training of LSTM RNNs in a large scale acoustic modeling task. We train the models in a distributed manner using the asynchronous stochastic gradient descent (ASGD) optimization technique. We compare two sequence discriminative criteria, maximum mutual information (MMI) and state-level minimum Bayes risk (sMBR), and we investigate a number of variations of the basic training strategy to better understand issues raised by both the sequential model and the objective function. We obtain significant gains over the CE-trained LSTM RNN model using sequence discriminative training techniques.
Pages: 1209-1213
Page count: 5
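The abstract describes two ideas that can be sketched concretely: an MMI-style sequence discriminative gradient on the acoustic-model outputs, expressed as numerator minus denominator state occupancies, and asynchronous SGD workers applying that gradient to shared parameters. The following is a minimal illustrative sketch in numpy, not the authors' implementation; function names such as `mmi_output_gradient` and `asgd_worker_step`, and the toy occupancy data, are hypothetical and introduced only for illustration.

```python
# Illustrative sketch only (hypothetical names, toy data): how an asynchronous
# worker might turn numerator/denominator state occupancies into an MMI-style
# gradient and apply it to shared parameters.
import numpy as np

def mmi_output_gradient(gamma_num, gamma_den):
    """MMI gradient w.r.t. per-frame log acoustic scores for one utterance.

    gamma_num: (frames, states) occupancies from the numerator (reference)
               lattice forward-backward pass.
    gamma_den: (frames, states) occupancies from the denominator lattice.
    The gradient of the MMI objective w.r.t. the log-likelihoods is the
    difference of the two occupancy matrices.
    """
    return gamma_num - gamma_den

def asgd_worker_step(params, utterance_batch, learning_rate=1e-5):
    """One asynchronous worker step: accumulate the sequence-level gradient
    over a batch of utterances and apply it to the shared parameters.
    In a real parameter-server setup the update would be sent to the server;
    here it is applied in place to keep the sketch self-contained."""
    grad = np.zeros_like(params)
    for gamma_num, gamma_den in utterance_batch:
        # In the paper the occupancies come from lattices produced with a
        # CE-trained LSTM RNN; here they are toy arrays of matching shape.
        grad += mmi_output_gradient(gamma_num, gamma_den).sum(axis=0)
    params += learning_rate * grad  # gradient ascent on the MMI objective
    return params

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_states = 4
    params = np.zeros(n_states)        # stand-in for output-layer biases
    batch = []
    for _ in range(3):                 # three toy "utterances"
        frames = rng.integers(5, 10)
        num = rng.dirichlet(np.ones(n_states), size=frames)
        den = rng.dirichlet(np.ones(n_states), size=frames)
        batch.append((num, den))
    print(asgd_worker_step(params, batch))
```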
Related Papers (50 in total)
  • [21] Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling
    Sak, Hasim
    Senior, Andrew
    Beaufays, Francoise
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 338 - 342
  • [22] Using Ant Colony Optimization to Optimize Long Short-Term Memory Recurrent Neural Networks
    ElSaid, AbdElRahman
    El Jamiy, Fatima
    Higgins, James
    Wild, Brandon
    Desell, Travis
    GECCO'18: PROCEEDINGS OF THE 2018 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2018, : 13 - 20
  • [23] Restoration of Missing Pressures in a Gas Well Using Recurrent Neural Networks with Long Short-Term Memory Cells
    Ki, Seil
    Jang, Ilsik
    Cha, Booho
    Seo, Jeonggyu
    Kwon, Oukwang
    ENERGIES, 2020, 13 (18)
  • [24] Forecasting Groundwater Table in a Flood Prone Coastal City with Long Short-term Memory and Recurrent Neural Networks
    Bowes, Benjamin D.
    Sadler, Jeffrey M.
    Morsy, Mohamed M.
    Behl, Madhur
    Goodall, Jonathan L.
    WATER, 2019, 11 (05)
  • [25] CONSTRUCTING LONG SHORT-TERM MEMORY BASED DEEP RECURRENT NEURAL NETWORKS FOR LARGE VOCABULARY SPEECH RECOGNITION
    Li, Xianggang
    Wu, Xihong
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4520 - 4524
  • [26] Forecasting cryptocurrency prices using Recurrent Neural Network and Long Short-term Memory
    Nasirtafreshi, I.
    DATA & KNOWLEDGE ENGINEERING, 2022, 139
  • [27] An Interpretation of Long Short-Term Memory Recurrent Neural Network for Approximating Roots of Polynomials
    Bukhsh, Madiha
    Ali, Muhammad Saqib
    Ashraf, Muhammad Usman
    Alsubhi, Khalid
    Chen, Weiqiu
    IEEE ACCESS, 2022, 10 : 28194 - 28205
  • [28] Long short-term memory recurrent neural network architectures for Urdu acoustic modeling
    Tehseen Zia
    Usman Zahid
    International Journal of Speech Technology, 2019, 22 : 21 - 30
  • [29] Long short-term memory recurrent neural network architectures for Urdu acoustic modeling
    Zia, Tehseen
    Zahid, Usman
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (01) : 21 - 30
  • [30] Short-term Load Forecasting with Distributed Long Short-Term Memory
    Dong, Yi
    Chen, Yang
    Zhao, Xingyu
    Huang, Xiaowei
    2023 IEEE POWER & ENERGY SOCIETY INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE, ISGT, 2023,