Improving the Performance of Far-Field Speaker Verification Using Multi-Condition Training: The Case of GMM-UBM and i-vector Systems

Cited by: 0

Authors:

Avila, Anderson R. ^{[1,2]}

Sarria-Paja, Milton ^{[1]}

Fraga, Francisco J. ^{[2]}

O'Shaughnessy, Douglas ^{[1]}

Falk, Tiago H. ^{[1]}

Affiliations:

[1] Univ Quebec, INRS EMT, Montreal, PQ, Canada

[2] Univ Fed ABC UFABC, Sao Paulo, Brazil

Source:

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014

Keywords:

Automatic speaker verification; GMM-UBM; i-vector; far-field; reverberation time; IDENTIFICATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While considerable work has been done to characterize the detrimental effects of channel variability on automatic speaker verification (ASV) performance, little attention has been paid to the effects of room reverberation. This paper investigates the effects of room acoustics on the performance of two far-field ASV systems: GMM-UBM (Gaussian mixture model - universal background model) and i-vector. We show that ASV performance is severely affected by reverberation, particularly for i-vector based systems. Three multi-condition training methods are then investigated to mitigate such detrimental effects. The first uses matched train/test speaker models based on estimated reverberation time (RT) values. The second uses two-condition training, in which clean and reverberant models are used. Lastly, a four-condition training setup is proposed in which models for clean, mild, moderate, and severe reverberation levels are used. Experimental results show that the first and third multi-condition training methods provide significant gains in performance relative to the baseline, with the latter being more suitable for practical resource-constrained far-field applications.

Citation

Pages: 1096-1100

Number of pages: 5

References: 28 in total

[1] [Anonymous], REAL WORLD SPEECH PR

[2] [Anonymous], TECH REP

[3] [Anonymous], P IEEE OD SPEAK LANG

[4] [Anonymous], P IEEE INSTR MEAS TE

[5] [Anonymous], MSRTR2013133 CSRC

[6] [Anonymous], 2006, P SPECOM

[7] A tutorial on text-independent speaker verification [J]. Bimbot, F; Bonastre, JF; Fredouille, C; Gravier, G; Magrin-Chagnolleau, I; Meignier, S; Merlin, T; Ortega-García, J; Petrovska-Delacrétaz, D; Reynolds, DA. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2004, 2004 (04): 430-451

[8] Front-End Factor Analysis for Speaker Verification [J]. Dehak, Najim; Kenny, Patrick J.; Dehak, Reda; Dumouchel, Pierre; Ouellet, Pierre. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 788-798

[9] Dehak N, 2009, INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, P1527

[10] Falk TH, 2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, P634
