A survey of techniques for incremental learning of HMM parameters

Cited by: 73
Authors
Khreich, Wael [1 ]
Granger, Eric [1 ]
Miri, Ali [2 ]
Sabourin, Robert [1 ]
Affiliations
[1] Univ Quebec, LIVIA, Ecole Technol Super, Montreal, PQ, Canada
[2] Ryerson Univ, Sch Comp Sci, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Incremental learning; On-line learning; Hidden Markov model; Limited training data; Expectation-maximization; Recursive estimation; HIDDEN MARKOV-MODELS; MAXIMUM-LIKELIHOOD-ESTIMATION; PROBABILISTIC FUNCTIONS; STATISTICAL COMPARISONS; ADAPTIVE ESTIMATION; ONLINE ESTIMATION; ALGORITHMS; CONVERGENCE; CLASSIFIERS; IDENTIFICATION;
DOI
10.1016/j.ins.2012.02.017
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
The performance of Hidden Markov Models (HMMs) targeted for complex real-world applications is often degraded because they are designed a priori using limited training data and prior knowledge, and because the classification environment changes during operations. Incremental learning of new data sequences allows HMM parameters to be adapted as new data becomes available, without retraining from scratch on all accumulated training data. This paper presents a survey of techniques in the literature that are suitable for incremental learning of HMM parameters. These techniques are classified according to their objective function, optimization technique, and target application, covering both block-wise and symbol-wise learning of parameters. The convergence properties of these techniques are presented along with an analysis of their time and memory complexity. In addition, the challenges faced when these techniques are applied to incremental learning are assessed for scenarios in which the new training data is limited and in which it is abundant. While convergence rate and resource requirements are critical factors when incremental learning is performed in one pass over an abundant data stream, effective stopping criteria and management of validation sets are important when learning is performed through several iterations over limited data. In both cases, managing the learning rate so as to integrate pre-existing knowledge with new data is crucial for maintaining a high level of performance. Finally, this paper underscores the need for empirical benchmarking studies among the techniques presented in the literature, and proposes several evaluation criteria based on non-parametric statistical testing to facilitate the selection of techniques for a particular application domain. (C) 2012 Elsevier Inc. All rights reserved.
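To make the block-wise, learning-rate-weighted re-estimation idea in the abstract concrete, the following is a minimal Python/NumPy sketch for a discrete-emission HMM: an E-step (forward-backward) is run on the new data block only, and the block-level estimates are blended into the current parameters with a learning rate eta. The function names (forward_backward, incremental_update) and the convex-combination blending rule are illustrative assumptions for exposition, not any specific algorithm from the survey.

import numpy as np

def forward_backward(A, B, pi, obs):
    # Scaled forward-backward pass for a discrete-emission HMM.
    # A: (N, N) transitions, B: (N, M) emissions, pi: (N,) initial
    # distribution, obs: array of symbol indices. Returns smoothed state
    # posteriors gamma (T, N) and pairwise posteriors xi (T-1, N, N).
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    scale = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = alpha[:-1, :, None] * A[None, :, :] \
        * (B[:, obs[1:]].T * beta[1:])[:, None, :]
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    return gamma, xi

def incremental_update(A, B, pi, obs, eta):
    # One block-wise incremental step: E-step on the new block only, then
    # blend the block-level estimates into the current parameters.
    # eta = 1 recovers plain Baum-Welch re-estimation on the block alone;
    # small eta favours previously learned knowledge.
    gamma, xi = forward_backward(A, B, pi, np.asarray(obs))
    A_blk = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_blk = np.zeros_like(B)
    for t, o in enumerate(obs):
        B_blk[:, o] += gamma[t]
    B_blk /= gamma.sum(axis=0)[:, None]
    return ((1 - eta) * A + eta * A_blk,
            (1 - eta) * B + eta * B_blk,
            (1 - eta) * pi + eta * gamma[0])

# Example: adapt a 2-state, 3-symbol model over five new data blocks.
rng = np.random.default_rng(0)
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
for block in rng.integers(0, 3, size=(5, 40)):
    A, B, pi = incremental_update(A, B, pi, block, eta=0.2)

Symbol-wise (recursive) variants would instead update filtered statistics after each observation; the convex blending above is only one of several learning-rate schemes of the kind the survey compares.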
Pages: 105-130
Number of pages: 26