Efficient training algorithms for HMM's using incremental estimation

Cited: 18
Authors
Gotoh, Y [1]
Hochberg, MM [1]
Silverman, HF [1]
Affiliation
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1998 / Vol. 6 / No. 6
Funding
US National Science Foundation;
关键词
HMM training algorithm; incremental estimation; MAP estimation;
DOI
10.1109/89.725320
CLC Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Typically, parameter estimation for a hidden Markov model (HMM) is performed using an expectation-maximization (EM) algorithm with the maximum-likelihood (ML) criterion. The EM algorithm is an iterative scheme that is well defined and numerically stable, but convergence may require a large number of iterations. For speech recognition systems utilizing large amounts of training material, this results in long training times. This paper presents an incremental estimation approach to speed up the training of HMMs without any loss of recognition performance. The algorithm selects a subset of data from the training set, updates the model parameters based on the subset, and then iterates the process until convergence of the parameters. The advantage of this approach is a substantial increase in the number of iterations of the EM algorithm per training token, which leads to faster training. In order to achieve reliable estimation from a small fraction of the complete data set at each iteration, two training criteria are studied: ML and maximum a posteriori (MAP) estimation. Experimental results show that the incremental algorithms train substantially faster than the conventional (batch) method and suffer no loss of recognition performance. Furthermore, the incremental MAP-based training algorithm improves performance over the batch version.
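The subset-selection loop the abstract describes can be sketched as follows. This is a minimal illustration only: it substitutes a 1-D two-component Gaussian mixture for the paper's HMM forward-backward statistics, and all names and parameters here (`incremental_em`, `batch_frac`, the initialization scheme) are invented for the sketch, not taken from the paper.

```python
import math
import random

def incremental_em(data, n_iters=60, batch_frac=0.2, seed=0):
    """Subset-based incremental EM for a 1-D two-component Gaussian mixture.

    Each iteration runs an E-step and M-step on a random subset of the
    training data rather than the full set, so far more EM iterations are
    performed per training token than in batch EM.
    """
    rng = random.Random(seed)
    mu = [min(data), max(data)]          # spread the initial means apart
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    m = max(2, int(batch_frac * len(data)))
    for _ in range(n_iters):
        subset = rng.sample(data, m)     # select this iteration's data subset
        # E-step on the subset: posterior responsibility of each component
        resp = []
        for x in subset:
            p = [w[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = sum(p) or 1e-300
            resp.append([pk / s for pk in p])
        # M-step: re-estimate parameters from the subset's statistics
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            mu[k] = sum(r[k] * x for r, x in zip(resp, subset)) / nk
            var[k] = max(1e-4, sum(r[k] * (x - mu[k]) ** 2
                                   for r, x in zip(resp, subset)) / nk)
            w[k] = nk / m
    return mu, var, w

# Toy usage: two well-separated clusters around 0 and 5
rng = random.Random(1)
data = [rng.gauss(0.0, 0.5) for _ in range(150)] + \
       [rng.gauss(5.0, 0.5) for _ in range(150)]
mu, var, w = incremental_em(data)
```

Because each pass touches only `batch_frac` of the data, the per-iteration cost drops proportionally; the paper's MAP variant additionally regularizes these small-subset estimates with a prior, which this sketch omits.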
Pages: 539-548
Page count: 10
Related Papers
20 total
  • [1] Baldi, P.; Chauvin, Y. Smooth online learning algorithms for hidden Markov models. Neural Computation, 1994, 6(2): 307-318.
  • [2] Baldi, P. Advances in Neural Information Processing Systems, Vol. 5, 1993.
  • [3] Baum, L.E.; Petrie, T.; Soules, G.; Weiss, N. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 1970, 41(1): 164-171.
  • [4] Berger, J.O. Statistical Decision Theory and Bayesian Analysis, 2nd ed., 1985. DOI: 10.1007/978-1-4757-4286-2.
  • [5] Box, G.E. Bayesian Inference in Statistical Analysis, 2011.
  • [6] DeGroot, M. Optimal Statistical Decisions, 1970.
  • [7] Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 1977, 39(1): 1-38.
  • [8] Duda, R.O. Pattern Classification, 1973.
  • [9] Gauvain, J.-L.; Lee, C.-H. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing, 1994, 2(2): 291-298.
  • [10] Gotoh, Y. MAP estimation, unpublished.