LANGUAGE DEPENDENT UNIVERSAL PHONEME POSTERIOR ESTIMATION FOR MIXED LANGUAGE SPEECH RECOGNITION

Cited by: 0

Authors:

Imseng, David [1]

Bourlard, Herve [1]

Magimai-Doss, Mathew [1]

Dines, John [1]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

Source:

2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011

Keywords:

Speech recognition; Mixed language speech recognition; Non-native speech; Acoustic model combination; Universal phoneme set;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a new approach to estimate "universal" phoneme posterior probabilities for mixed language speech recognition. More specifically, we propose a new theoretical framework to combine phoneme class posterior probabilities in a principled way by using (statistical) evidence about the language identity. We investigate the proposed approach in a mixed language environment (SpeechDat(II)) consisting of five European languages. Our studies show that the proposed approach can yield significant improvements on a mixed language task, while maintaining the performance on monolingual tasks. Additionally, through a case study, we also demonstrate the potential benefits of the proposed approach for non-native speech recognition.


Pages: 5012-5015

Page count: 4

References (7 in total):

[1] Imseng D., 2010, P INTERSPEECH, V2010, P2722, DOI 10.21437/Interspeech.2010-721
[2] Imseng David, 2010, P INTERSPEECH, P278
[3] Köhler, J. Multilingual phone models for vocabulary-independent speech recognition tasks [J]. SPEECH COMMUNICATION, 2001, 35 (1-2): 21-30
[4] Raab Martin, 2008, P TSD BRNO CZECH REP, P485
[5] Richard, Michael D.; Lippmann, Richard P. Neural Network Classifiers Estimate Bayesian a posteriori Probabilities [J]. NEURAL COMPUTATION, 1991, 3 (04): 461-483
[6] Schultz, T; Waibel, A. Language-independent and language-adaptive acoustic modeling for speech recognition [J]. SPEECH COMMUNICATION, 2001, 35 (1-2): 31-51
[7] Van Compernolle, D. Recognizing speech of goats, wolves, sheep and ... non-natives [J]. SPEECH COMMUNICATION, 2001, 35 (1-2): 71-79
