Unsupervised Adaptation of Recurrent Neural Network Language Models

Cited by: 20
Authors
Gangireddy, Siva Reddy [1 ]
Swietojanski, Pawel [1 ]
Bell, Peter [1 ]
Renals, Steve [1 ]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Funding
UK Engineering and Physical Sciences Research Council; Japan Science and Technology Agency;
Keywords
RNNLM; LHUC; unsupervised adaptation; fine-tuning; MGB-Challenge;
DOI
10.21437/Interspeech.2016-1342
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Recurrent neural network language models (RNNLMs) have been shown to consistently improve the word error rates (WERs) of large-vocabulary speech recognition systems that employ n-gram LMs. In this paper we investigate supervised and unsupervised discriminative adaptation of RNNLMs in a broadcast transcription task to target domains defined by either genre or show. We explore two approaches: (1) scaling forward-propagated hidden activations (the Learning Hidden Unit Contributions (LHUC) technique) and (2) direct fine-tuning of the parameters of the whole RNNLM. To investigate the effectiveness of the proposed methods, we carry out experiments on multi-genre broadcast (MGB) data following the MGB-2015 challenge protocol. We observe small but significant improvements in WER compared to a strong unadapted RNNLM.
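The LHUC idea named in the abstract, scaling hidden activations with a small set of adaptation parameters, can be illustrated with a minimal sketch. It assumes the commonly used amplitude function 2·sigmoid(r), which the record itself does not state. The names `lhuc_scale`, `hidden_size`, and all numeric values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lhuc_scale(h, r):
    """Rescale hidden activations h element-wise by an amplitude in (0, 2).

    Assumption: the amplitude is 2 * sigmoid(r), one parameter per hidden unit.
    """
    return 2.0 * sigmoid(r) * h

rng = np.random.default_rng(0)
hidden_size = 4                                   # hypothetical size
h = sigmoid(rng.standard_normal(hidden_size))     # stand-in hidden activations

# r = 0 gives amplitude 1, so the adapted layer starts out identical
# to the unadapted one.
r_init = np.zeros(hidden_size)
h_unadapted = lhuc_scale(h, r_init)

# Made-up values standing in for parameters learned on target-domain data.
r_adapted = np.array([1.0, -1.0, 0.0, 2.0])
h_adapted = lhuc_scale(h, r_adapted)
```

In this sketch only `r` would be updated during adaptation while the network weights stay fixed, in contrast to fine-tuning, which updates all parameters of the model.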
Pages: 2333-2337
Number of pages: 5