Speaker verification using adapted Gaussian mixture models

Cited by: 2924

Authors:

Reynolds, DA [1]

Quatieri, TF [1]

Dunn, RB [1]

Affiliations:

[1] MIT, Lincoln Lab, Speech Syst Technol Grp, Lexington, MA 02420 USA

Source:

DIGITAL SIGNAL PROCESSING | 2000 / Vol. 10 / Issues 1-3

Keywords:

speaker recognition; Gaussian mixture models; likelihood ratio detector; universal background model; handset normalization; NIST evaluation;

DOI:

10.1006/dspr.1999.0361

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this paper we describe the major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs). The system is built around the likelihood ratio test for verification, using simple but effective GMMs for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. The development and use of a handset detector and score normalization to greatly improve verification performance are also described and discussed. Finally, representative performance benchmarks and system behavior experiments on NIST SRE corpora are presented. © 2000 Academic Press.
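The likelihood ratio test with a UBM and adapted speaker models, as summarized in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs, adapts only the component means, and uses a toy two-component UBM with synthetic 2-D data. The function names, the relevance value of 16, and all numeric settings are illustrative assumptions that do not come from this record.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Per-component log density of diagonal Gaussians; returns shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    return -0.5 * (log_norm + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a GMM."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    return float(np.mean(logsumexp(log_p, axis=1)))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only Bayesian (MAP) adaptation of a UBM toward enrollment data X.
    The relevance factor is an assumed, illustrative value."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    post = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])  # (T, M) responsibilities
    n = post.sum(axis=0)                                      # soft counts per component
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]         # posterior-weighted means
    alpha = (n / (n + relevance))[:, None]                    # data-dependent mixing
    return alpha * ex + (1.0 - alpha) * means

def llr_score(X, weights, ubm_means, variances, spk_means):
    """Log-likelihood ratio: speaker model versus UBM (shared weights, variances)."""
    return (avg_log_likelihood(X, weights, spk_means, variances)
            - avg_log_likelihood(X, weights, ubm_means, variances))

# Toy UBM (hypothetical values, not trained on real speech features).
w = np.array([0.5, 0.5])
mu = np.array([[-2.0, 0.0], [2.0, 0.0]])
var = np.ones((2, 2))

rng = np.random.default_rng(0)
spk_center = np.array([2.5, 1.0])                        # synthetic "target speaker"
enroll = spk_center + rng.standard_normal((200, 2))      # enrollment frames
target_test = spk_center + rng.standard_normal((100, 2))
comp = rng.integers(0, 2, size=100)                      # impostor frames drawn from the UBM
impostor_test = mu[comp] + rng.standard_normal((100, 2))

spk_means = map_adapt_means(enroll, w, mu, var)
target_score = llr_score(target_test, w, mu, var, spk_means)
impostor_score = llr_score(impostor_test, w, mu, var, spk_means)
print("target LLR:", target_score, "impostor LLR:", impostor_score)
```

A verification decision would compare the score against a threshold. The handset detection and score normalization mentioned in the abstract are not modeled in this sketch.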

References

Pages: 19-41

Page count: 23

33 references in total

[1]

Reynolds D. A., 1992, GAUSSIAN MIXTURE MOD

[2]

[Anonymous], 1997, Proceedings of the European Conference on Speech Communication and Technology

[3]

CAREY MJ, 1991, P INT C AC SPEECH SI, P397

[4] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].

DEMPSTER, AP ;

LAIRD, NM ;

RUBIN, DB .

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38

[5]

DODDINGTON G, IN PRESS SPEECH COMM

[6] Approaches to speaker detection and tracking in conversational speech [J].

Dunn, RB ;

Reynolds, DA ;

Quatieri, TF .

DIGITAL SIGNAL PROCESSING, 2000, 10 (1-3) :93-112

[7]

Fukunaga K., 1972, Introduction to statistical pattern recognition

[8] Maximum a Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains [J].

Gauvain, Jean-Luc ;

Lee, Chin-Hui .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (02) :291-298

[9]

Hart P.E., 1973, Pattern recognition and scene analysis

[10]

HECK LP, 1997, P ICASSP, P1071
