A study of variable-parameter Gaussian mixture hidden Markov modeling for noisy speech recognition

Cited by: 30
Authors
Cui, Xiaodong
Gong, Yifan
Affiliations
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90024 USA
[2] Microsoft Corp, Redmond, WA 98052 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 04
Keywords
hidden Markov model; noise robust speech recognition; polynomial regression; variable parameter;
DOI
10.1109/TASL.2006.889791
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
To improve recognition performance in noisy environments, multicondition training is usually applied, in which speech signals corrupted by a variety of noises are used in acoustic model training. Published hidden Markov modeling of speech uses multiple Gaussian distributions to cover the spread of the speech distribution caused by noise, which distracts from modeling the speech event itself and possibly sacrifices performance on clean speech. In this paper, we propose a novel approach which extends the conventional Gaussian mixture hidden Markov model (GMHMM) by modeling the state emission parameters (mean and variance) as polynomial functions of a continuous environment-dependent variable. At recognition time, a set of HMMs specific to the given value of the environment variable is instantiated and used for recognition. The maximum-likelihood (ML) estimation of the polynomial functions of the proposed variable-parameter GMHMM is given within the expectation-maximization (EM) framework. Experiments on the Aurora 2 database show significant improvements of the variable-parameter Gaussian mixture HMMs over the conventional GMHMMs.
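The instantiation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the coefficient layout, and the choice to apply the polynomial to the log-variance (so that the instantiated variance stays positive) are all assumptions made for illustration, and the abstract does not state how the environment variable is measured.

```python
import math

def poly_eval(coeffs, v):
    """Evaluate c0 + c1*v + c2*v**2 + ... with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * v + c
    return result

def instantiate_gaussian(mean_coeffs, log_var_coeffs, v):
    """Build one diagonal Gaussian for environment value v.

    mean_coeffs[d] and log_var_coeffs[d] hold the polynomial
    coefficients for feature dimension d (hypothetical layout).
    Assumption: the polynomial models the log-variance, which keeps
    the instantiated variance strictly positive.
    """
    mean = [poly_eval(c, v) for c in mean_coeffs]
    var = [math.exp(poly_eval(c, v)) for c in log_var_coeffs]
    return mean, var

def log_likelihood(x, mean, var):
    """Log density of x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s) + (xi - m) ** 2 / s)
        for xi, m, s in zip(x, mean, var)
    )

# Hypothetical two-dimensional component with made-up coefficients.
mean_coeffs = [[1.0, 0.5], [0.0, -0.2, 0.01]]
log_var_coeffs = [[0.0], [0.0]]
mean, var = instantiate_gaussian(mean_coeffs, log_var_coeffs, v=10.0)
score = log_likelihood(mean, mean, var)
```

In this sketch, only the polynomial coefficients are stored in the model; the Gaussian actually used for scoring is rebuilt whenever the environment value changes.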
Pages: 1366 - 1376
Number of pages: 11