Irrelevant variability normalization based HMM training using MAP estimation of feature transforms for robust speech recognition

Cited by: 0

Authors:

Zhu, Donglai [1]

Huo, Qiang [2]

Affiliations:

[1] Institute for Infocomm Research, Singapore

[2] Microsoft Research Asia, Beijing, People's Republic of China

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

robust speech recognition; feature transformation; MAP estimate; hidden Markov model

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In the past several years, we have been studying feature transformation (FT) approaches to robust automatic speech recognition (ASR) that can compensate for possible "distortions" caused by factors irrelevant to phonetic classification in both the training and recognition stages. Several FT functions with different degrees of flexibility have been studied, and the corresponding maximum likelihood (ML) training techniques have been developed. In this paper, we study another FT function, which takes the most flexible form: a frame-dependent linear transformation. Maximum a posteriori (MAP) estimation is used to estimate the FT function parameters, in order to deal with the possible problem of insufficient training data caused by the increased number of model parameters. The effectiveness of the proposed approach is confirmed by evaluation experiments on the Finnish Aurora3 database.


Pages: 4717 / +

Page count: 2
