UBM based speaker selection and model re-estimation for speaker adaptation

Cited by: 0

Authors:

Wang, Jian [1]

Guo, Jun [1]

Liu, Gang [1]

Lei, Jianjun [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Sch Informat Engn, Wangjian 200810, Peoples R China

Source:

PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2 | 2006

Funding:

National Natural Science Foundation of China;

Keywords:

speaker adaptation; speaker selection; UBM;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speaker adaptation based on speaker selection can achieve promising performance. In such a system, how to represent a speaker and how to compute the selection remain major issues. In this paper, we use a Gaussian mixture model (GMM), adapted from a universal background model (UBM), to represent each speaker. The likelihood ratio (LR) and cross likelihood ratio (CLR) are used for speaker selection. Furthermore, a single-pass re-estimation procedure, conditioned on the speaker-independent model, is presented. This adaptation strategy was evaluated on a large-vocabulary speech recognition task, achieving a relative gain of 11% over the baseline system.
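The selection step described in the abstract can be illustrated with a minimal sketch. The record does not give the paper's formulas, so everything here is an assumption: speaker GMMs are derived from the UBM by mean-only MAP adaptation with a relevance factor of 16, LR is the average per-frame log-likelihood ratio against the UBM, and CLR is the symmetric sum of the two cross LRs. All function names are hypothetical, and the single-pass re-estimation step is not covered.

```python
import numpy as np

# A GMM is a tuple (weights (M,), means (M, D), diagonal variances (M, D)).

def log_joint(X, w, mu, var):
    """Per-frame, per-component log p(x_t, component m); shape (T, M)."""
    diff = X[:, None, :] - mu[None, :, :]
    log_comp = -0.5 * (np.sum(diff ** 2 / var, axis=2)
                       + np.sum(np.log(2 * np.pi * var), axis=1))
    return log_comp + np.log(w)

def avg_loglik(X, gmm):
    """Average per-frame log-likelihood, via a stable log-sum-exp."""
    lj = log_joint(X, *gmm)
    m = lj.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.exp(lj - m).sum(axis=1))))

def map_adapt_means(X, ubm, relevance=16.0):
    """Assumed adaptation scheme: mean-only MAP adaptation from the UBM."""
    w, mu, var = ubm
    lj = log_joint(X, w, mu, var)
    post = np.exp(lj - lj.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)        # responsibilities (T, M)
    n = post.sum(axis=0)                           # soft counts (M,)
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]         # data-dependent mixing
    return w, alpha * ex + (1.0 - alpha) * mu, var

def lr(X, speaker_gmm, ubm):
    """Average log-likelihood ratio of X: speaker model versus UBM."""
    return avg_loglik(X, speaker_gmm) - avg_loglik(X, ubm)

def clr(X_a, gmm_a, X_b, gmm_b, ubm):
    """Assumed CLR: each speaker's data scored on the other's model."""
    return lr(X_a, gmm_b, ubm) + lr(X_b, gmm_a, ubm)

def select_speakers(X_test, train_data, ubm, k):
    """Return the k training speakers with the highest CLR to the test data."""
    test_gmm = map_adapt_means(X_test, ubm)
    scores = {name: clr(X_test, test_gmm, X, map_adapt_means(X, ubm), ubm)
              for name, X in train_data.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In a selection-based adaptation system, the data of the selected speakers would then be used to re-estimate the recognizer's acoustic model. The LR variant would score only the test data against each training speaker's GMM, whereas the CLR variant above also scores each training speaker's data against the test speaker's GMM.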


Pages: 856-860

Page count: 5

