Rapid online adaptation using speaker space model evolution

Cited by: 2
Authors
Kim, DK
Kim, NS
Affiliations
[1] Elect & Telecommun Res Inst, Comp & Software Res Lab, Taejon 305350, South Korea
[2] Seoul Natl Univ, Sch Elect Engn, Seoul 151742, South Korea
[3] Seoul Natl Univ, INMC, Seoul 151742, South Korea
Keywords
speaker space model; prior evolution; latent variable model; Quasi-Bayes estimate; online adaptation; rapid speaker adaptation;
DOI
10.1016/j.specom.2004.01.002
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper presents a new approach to online adaptation of continuous density hidden Markov models (CDHMMs) from a small amount of adaptation data, based on speaker space model (SSM) evolution. The SSM, which characterizes the a priori knowledge of the training speakers, is described in terms of latent variable models such as factor analysis or probabilistic principal component analysis. The SSM provides several sources of information that are useful for rapid online adaptation: correlation information, the prior density, and prior knowledge of the CDHMM parameters. We design the SSM evolution based on the quasi-Bayes estimation technique, which incrementally updates the hyperparameters of the SSM and the CDHMM parameters simultaneously. In a series of speaker adaptation experiments on continuous digit and large vocabulary recognition tasks, we demonstrate that the proposed approach not only achieves good performance with a small amount of adaptation data but also maintains good asymptotic convergence as the amount of data increases. (C) 2004 Elsevier B.V. All rights reserved.
Pages: 467-478
Page count: 12