USING KL-DIVERGENCE AND MULTILINGUAL INFORMATION TO IMPROVE ASR FOR UNDER-RESOURCED LANGUAGES

Cited by: 0
Authors
Imseng, David [1]
Bourlard, Herve [1]
Garner, Philip N. [1]
Affiliation
[1] Idiap Res Inst, Martigny, Switzerland
Source
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012
Keywords
Multilingual speech recognition; neural network features; fast training; Kullback-Leibler divergence;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Setting out from the point of view that automatic speech recognition (ASR) ought to benefit from data in languages other than the target language, we propose a novel Kullback-Leibler (KL) divergence based method that is able to exploit multilingual information in the form of universal phoneme posterior probabilities conditioned on the acoustics. We formulate a means to train a recognizer on several different languages, and subsequently recognize speech in a target language for which only a small amount of data is available. Taking the Greek SpeechDat(II) data as an example, we show that the proposed formulation is sound, and show that it is able to outperform a current state-of-the-art HMM/GMM system. We also use a hybrid Tandem-like system to further understand the source of the benefit.
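To make the central quantity concrete, here is a minimal sketch of a KL-divergence cost between a phoneme posterior vector and a set of categorical state distributions. It is an illustration only, not the authors' formulation: the function names `kl_divergence` and `closest_state`, the three-class toy data, the direction of the divergence, and the `eps` floor are all assumptions made for this example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


def closest_state(posterior, state_distributions):
    """Return the index of the state distribution with the smallest
    KL divergence to the given posterior frame."""
    costs = [kl_divergence(y, posterior) for y in state_distributions]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical toy data: three states over three universal phoneme classes.
states = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
# One frame of phoneme posteriors, e.g. as produced by a neural network.
frame = [0.2, 0.7, 0.1]

best = closest_state(frame, states)
```

In this sketch the frame is matched to the second state (index 1), whose distribution puts most of its mass on the same phoneme class. Which argument plays the reference role in the divergence is a design choice; the record above does not specify it.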
Pages: 4869-4872
Page count: 4