Long Short-Term Memory Based Language Model for Indonesian Spontaneous Speech Recognition

Citations: 0
Authors
Putri, Fanda Yuliana [1]
Lestari, Dessi Puji [1]
Widyantoro, Dwi Hendratmo [1]
Affiliations
[1] Bandung Inst Technol, Sch Elect Engn & Informat, Bandung, Indonesia
Source
2018 INTERNATIONAL CONFERENCE ON COMPUTER, CONTROL, INFORMATICS AND ITS APPLICATIONS (IC3INA) | 2018
Keywords
speech recognition system; ASR; spontaneous; language model; perplexity; LSTM; n-gram;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202
Abstract
Robust recognition of daily or spontaneous conversation is necessary for a speech recognizer deployed in real-world applications. However, Indonesian automatic speech recognition (ASR) still performs poorly on spontaneous speech compared to dictated speech. In this work, we used a deep neural network approach, focusing primarily on long short-term memory (LSTM) to improve language model performance, as LSTM has been successfully applied to many problems with long context dependencies, including language modeling. We tried different architectures and parameters to find the optimal combination, including deep LSTMs and LSTM with a projection layer (LSTMP). Thereafter, different types of corpora were employed to enrich the language model linguistically. All our LSTM language models achieved significant improvements in perplexity and word error rate (%WER) compared to the n-gram baseline. The perplexity improvement was up to 50.6% and the best WER reduction was 3.61%, as evaluated with a triphone GMM-HMM acoustic model. The best architecture combination we found was a deep LSTMP with L2 regularization.
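The abstract reports its main result as a perplexity improvement over an n-gram baseline. As a minimal sketch of how such figures are commonly computed, the snippet below derives perplexity from per-token probabilities and a relative reduction between two scores. The function names and the sample numbers are illustrative assumptions, not taken from the paper, and the record does not state whether the reported percentages are relative or absolute.

```python
import math


def perplexity(token_probs):
    """Perplexity: exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical example: a model assigning probability 0.25 to every token
# is as uncertain as a uniform choice among four words.
uniform_ppl = perplexity([0.25] * 8)

# Hypothetical baseline and improved perplexities (not the paper's values).
gain = relative_reduction(200.0, 100.0)
```

Lower perplexity means the model assigns higher probability to the held-out text, so a positive relative reduction indicates an improvement over the baseline.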
Pages: 44-48
Number of pages: 5