Crosslingual acoustic model development for automatic speech recognition

Cited by: 0
Authors
Diehl, Frank [1]
Moreno, Asuncion [1]
Monte, Enric [1]
Affiliations
[1] Univ Politecn Cataluna, TALP Res Ctr, ES-08034 Barcelona, Spain
Source
2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2 | 2007
Keywords
crosslingual; acoustic modelling
DOI
10.1109/ASRU.2007.4430150
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work we discuss the development of two crosslingual acoustic model sets for automatic speech recognition (ASR). The starting point is a set of multilingual Spanish-English-German hidden Markov models (HMMs); the target languages are Slovenian and French. We consider the problem of defining a multilingual phoneme set and the associated dictionary mapping, and describe a method to circumvent the related problems. The impact of the acoustic source models on the performance of the target systems is analyzed in detail. Several crosslingually defined target systems are built and compared to their monolingual counterparts. The crosslingually built acoustic models clearly outperform purely monolingual models when only a limited amount of target data is available.
Pages: 425-430
Page count: 6