Multi-candidate missing data imputation for robust speech recognition

Cited by: 0
Authors
Yujun Wang
Hugo Van hamme
Affiliation
[1] Katholieke Universiteit Leuven, Department of ESAT
Source
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2012
Keywords
speech recognition; constrained optimization; missing data; noise robustness
DOI
Not available
Abstract
The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations imply solving many constrained least squares (CLSQ) optimization problems. As an alternative, researchers have proposed frontend MDT or have made oversimplifying independence assumptions for the backend acoustic model. In this article, we propose a fast Multi-Candidate (MC) approach that solves the per-Gaussian CLSQ problems approximately by selecting the best from a small set of candidate solutions, which are generated as the MDT solutions on a reduced set of cluster Gaussians. Experiments show that the MC MDT runs as fast as the uncompensated recognizer while achieving the accuracy of the full backend optimization approach. The experiments also show that exploiting the more accurate acoustic model of the backend does pay off in terms of accuracy when compared to frontend MDT.
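The multi-candidate idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not state: the constraint form (reliable components are fixed to the observation, unreliable ones are bounded above by it), the projected coordinate-descent solver, and all function names (`mahalanobis`, `solve_clsq`, `multi_candidate_scores`), which are hypothetical.

```python
def mahalanobis(x, mean, prec):
    """Return (x - mean)^T prec (x - mean) for a full precision matrix."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    n = len(d)
    return sum(d[i] * prec[i][j] * d[j] for i in range(n) for j in range(n))


def solve_clsq(y, reliable, mean, prec, sweeps=200):
    """Approximately minimise the Mahalanobis distance to one Gaussian.

    Assumed constraints: x[i] == y[i] where reliable[i] is True,
    x[i] <= y[i] otherwise. Solved by projected coordinate descent,
    which is a stand-in solver chosen for brevity.
    """
    n = len(y)
    # Feasible starting point: observation or clipped mean.
    x = [y[i] if reliable[i] else min(mean[i], y[i]) for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            if reliable[i]:
                continue
            # Unconstrained minimiser along coordinate i, then clip to the bound.
            cross = sum(prec[i][j] * (x[j] - mean[j]) for j in range(n) if j != i)
            x[i] = min(mean[i] - cross / prec[i][i], y[i])
    return x


def multi_candidate_scores(y, reliable, clusters, gaussians):
    """Score every Gaussian using only candidates from the cluster Gaussians.

    One constrained problem is solved per cluster Gaussian. Each backend
    Gaussian then keeps the candidate with the smallest distance instead
    of solving its own constrained problem.
    """
    candidates = [solve_clsq(y, reliable, m, p) for m, p in clusters]
    return [min(mahalanobis(c, m, p) for c in candidates) for m, p in gaussians]


# Toy data (made up for illustration): 3 features, the first one reliable.
y = [1.0, 0.0, 2.0]
reliable = [True, False, False]
prec = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]]
clusters = [([0.0, 0.0, 0.0], prec), ([2.0, 1.0, 1.0], prec)]
gaussians = [clusters[0], ([0.2, -0.1, 0.1], prec), ([1.8, 1.2, 0.9], prec)]
scores = multi_candidate_scores(y, reliable, clusters, gaussians)
```

Because every candidate satisfies the constraints, the selected distance can never fall below the exact constrained optimum for a given Gaussian. The approximation therefore only ever overestimates the distance, and the cost is one constrained solve per cluster Gaussian plus cheap distance evaluations.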