Online hierarchical transformation of hidden Markov models for speech recognition

Cited by: 31
Authors
Chien, JT [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1999 / Vol. 7 / No. 06
Keywords
approximate Bayesian estimate; EM algorithm; hidden Markov models; online hierarchical transformation; speaker adaptation; speech recognition;
DOI
10.1109/89.799691
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper proposes a novel framework for online hierarchical transformation of hidden Markov model (HMM) parameters for adaptive speech recognition. Our goal is to incrementally transform (or adapt) all the HMM parameters to a new acoustic environment even though most HMM units are unseen in the observed adaptation data. We establish a hierarchical tree of HMM units and apply the tree to dynamically search for the transformation parameters of individual HMM mixture components. The transformation framework is formulated according to the approximate Bayesian estimate, in which the prior statistics and the transformation parameters can be jointly and incrementally refreshed after each consecutive block of adaptation data is presented. Using this formulation, only the refreshed prior statistics and the current block of data are needed for online transformation. In a series of speaker adaptation experiments on the recognition of 408 Mandarin syllables, we examine the effects of constructing various types of hierarchical trees. The efficiency and effectiveness of the proposed method for incremental adaptation of all HMM units are also confirmed. In addition, we demonstrate the superiority of the proposed online transformation over Huo's on-line adaptation [16] across a wide range of adaptation data.
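The two ideas in the abstract (a tree that lets unseen units borrow a transformation from an ancestor, and prior statistics that are refreshed block by block) can be illustrated with a small sketch. This is not the paper's formulation: it assumes a scalar bias transformation with a Gaussian prior and known noise precision, and every name below (`Node`, `observe`, `find_bias`, `min_count`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a hypothetical tree over HMM mixture components."""
    name: str
    parent: Optional["Node"] = None
    mean: float = 0.0       # prior mean of the scalar bias (assumed model)
    precision: float = 1.0  # prior precision of the scalar bias
    count: int = 0          # number of residuals folded into this node

    def update(self, residuals, noise_precision=1.0):
        """Fold one block of residuals into the prior (conjugate Gaussian update)."""
        n = len(residuals)
        if n == 0:
            return
        new_precision = self.precision + n * noise_precision
        self.mean = (self.precision * self.mean
                     + noise_precision * sum(residuals)) / new_precision
        self.precision = new_precision
        self.count += n


def observe(leaf, residuals):
    """Propagate a block of data from a leaf up to the root."""
    node = leaf
    while node is not None:
        node.update(residuals)
        node = node.parent


def find_bias(leaf, min_count=3):
    """Climb from a leaf to the nearest ancestor that has seen enough data."""
    node = leaf
    while node.parent is not None and node.count < min_count:
        node = node.parent
    return node.mean


root = Node("root")
unit_a = Node("a", parent=root)
unit_b = Node("b", parent=root)

observe(unit_a, [1.0, 1.0, 1.0])  # first block of adaptation data
observe(unit_b, [3.0])            # second block; earlier data is not revisited

bias_a = find_bias(unit_a)  # unit "a" has enough data of its own
bias_b = find_bias(unit_b)  # unit "b" falls back to the root statistics
```

After each block, only the refreshed `mean` and `precision` are kept, so earlier data never needs to be stored. In this simplified Gaussian case the incremental result equals a batch update over all blocks; the approximate Bayesian estimate in the paper addresses the harder HMM setting, where such exactness is not available.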
Pages: 656-667
Page count: 12
Related papers
43 items in total
[1]  
ABRASH V, 1996, P IEEE INT C AC SPEE, P729
[2]  
[Anonymous], AUTOMATIC SPEECH SPE
[3]   SMOOTH ONLINE LEARNING ALGORITHMS FOR HIDDEN MARKOV-MODELS [J].
BALDI, P ;
CHAUVIN, Y .
NEURAL COMPUTATION, 1994, 6 (02) :307-318
[4]   A hybrid algorithm for speaker adaptation using MAP transformation and adaptation [J].
Chien, JT ;
Lee, CH ;
Wang, HC .
IEEE SIGNAL PROCESSING LETTERS, 1997, 4 (06) :167-169
[5]   Telephone speech recognition based on Bayesian adaptation of hidden Markov models [J].
Chien, JT ;
Wang, HC .
SPEECH COMMUNICATION, 1997, 22 (04) :369-384
[6]  
CHIEN JT, 1997, P 5 EUR C SPEECH COM, V5, P2575
[7]  
CHIEN JT, 1997, P 1997 INT C AC SPEE, V2, P1027
[8]  
DeGroot M., 1970, OPTIMAL STAT DECISIO
[9]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[10]  
DIGALAKIS V, 1997, P 5 EUR C SPEECH COM, V4, P1859