Multi-candidate missing data imputation for robust speech recognition

Cited by: 0

Authors:

Yujun Wang

Hugo Van hamme

Affiliation:

[1] Katholieke Universiteit Leuven, Department of ESAT

Source:

EURASIP Journal on Audio, Speech, and Music Processing, vol. 2012

Keywords:

speech recognition; constrained optimization; missing data; noise robustness

DOI:

Not available

Abstract:

The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations imply solving many constrained least squares (CLSQ) optimization problems. As an alternative, researchers have proposed frontend MDT or have made oversimplifying independence assumptions for the backend acoustic model. In this article, we propose a fast Multi-Candidate (MC) approach that solves the per-Gaussian CLSQ problems approximately by selecting the best from a small set of candidate solutions, which are generated as the MDT solutions on a reduced set of cluster Gaussians. Experiments show that the MC MDT runs as fast as the uncompensated recognizer while achieving the accuracy of the full backend optimization approach. The experiments also show that exploiting the more accurate acoustic model of the backend pays off in accuracy when compared to frontend MDT.
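
The multi-candidate idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual bounded-imputation form of the constraints (reliable feature components are fixed to the observation, unreliable ones are bounded above by it), and it uses a plain projected gradient loop as a stand-in CLSQ solver. All function names and the toy numbers are hypothetical.

```python
def quad_cost(x, mean, prec):
    """Quadratic cost (x - mean)^T prec (x - mean) for one Gaussian."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    n = len(d)
    return sum(d[i] * prec[i][j] * d[j] for i in range(n) for j in range(n))


def solve_clsq(mean, prec, obs, reliable, iters=500):
    """Stand-in CLSQ solver (assumption: projected gradient descent).

    Minimizes quad_cost subject to x_i == obs_i for reliable components
    and x_i <= obs_i for unreliable ones.
    """
    n = len(mean)
    # The largest absolute row sum bounds the largest eigenvalue of prec,
    # so this step size is safe for a symmetric positive semidefinite prec.
    step = 1.0 / (2.0 * max(sum(abs(v) for v in row) for row in prec))
    x = [obs[i] if reliable[i] else min(mean[i], obs[i]) for i in range(n)]
    for _ in range(iters):
        d = [x[i] - mean[i] for i in range(n)]
        grad = [2.0 * sum(prec[i][j] * d[j] for j in range(n)) for i in range(n)]
        # Gradient step followed by projection back onto the constraints.
        x = [obs[i] if reliable[i] else min(x[i] - step * grad[i], obs[i])
             for i in range(n)]
    return x


def multi_candidate_impute(gaussians, clusters, obs, reliable):
    """Pick, per Gaussian, the cheapest of the cluster-level solutions."""
    # The expensive solve runs once per cluster Gaussian only.
    candidates = [solve_clsq(m, p, obs, reliable) for m, p in clusters]
    results = []
    for mean, prec in gaussians:
        costs = [quad_cost(c, mean, prec) for c in candidates]
        best = min(range(len(candidates)), key=costs.__getitem__)
        results.append((candidates[best], costs[best]))
    return results


# Toy 2-D example: component 0 is reliable, component 1 is unreliable.
obs = [1.0, 2.0]
reliable = [True, False]
clusters = [
    ([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]]),
    ([3.0, 3.0], [[1.0, 0.0], [0.0, 1.0]]),
]
gaussians = [
    ([0.2, -0.1], [[1.0, 0.4], [0.4, 1.0]]),
    ([2.8, 3.1], [[1.2, 0.0], [0.0, 0.9]]),
]
results = multi_candidate_impute(gaussians, clusters, obs, reliable)
```

Every candidate satisfies the constraints, so the selected one is always a feasible, possibly suboptimal, solution for each Gaussian. The cost of the constrained solves scales with the number of cluster Gaussians, while each full-model Gaussian only needs cheap cost evaluations.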
