Client-wise cohort set selection by combining speaker- and phoneme-specific I-vectors for speaker verification

Cited by: 0

Authors:

Waquar Ahmad

Harish Karnick

Rajesh M. Hegde

Affiliations:

[1] NIT Sikkim, Department of ECE

[2] IIT Kanpur, Department of Computer Science and Engineering

[3] IIT Kanpur, Department of Electrical Engineering

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Speaker verification; Speaker recognition; Cohort selection

DOI:

Not available

Abstract:

This work explores the use of phoneme-level information in cohort selection to improve the performance of a speaker verification system. In speaker verification, a cohort is used in score normalization, a technique that reduces the undesirable score variation arising from acoustically mismatched conditions. Proper selection of the cohort significantly improves speaker verification performance. In this paper, we investigate cohort selection based on a speaker model cluster under the i-vector framework, which we call the i-vector model cluster (IMC). Two approaches to cohort selection are proposed. The first, speaker-specific cohort selection (SSCS), uses speaker-level information. The second, phoneme-specific cohort selection (PSCS), improves cohort set selection by using phoneme-level information. Phoneme-level information is further employed in a late fusion approach that applies majority voting to the normalized scores to improve the performance of the speaker verification system. Speaker verification experiments were conducted using the TIMIT, HINDI and YOHO databases. Compared with the standard ZT-Norm method, the proposed method improves the equal error rate by 19.01%, 14.61% and 19.4% on the TIMIT, HINDI and YOHO datasets, respectively. Reasonable improvements are also obtained in terms of the minimum decision cost function (min DCF) and detection error trade-off (DET) curves.
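As a rough illustration of the ideas named in the abstract, the sketch below shows client-wise cohort selection, cohort-based score normalization, and majority-vote fusion in plain Python. It is not the authors' implementation: cosine scoring, the top-k selection rule, the simplified Z-norm-style normalization (rather than full ZT-Norm), and all function names (`select_cohort`, `znorm_score`, `majority_vote`) are assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_cohort(client_ivec, impostor_ivecs, k):
    """Pick the k impostor vectors closest to the client (assumed selection rule)."""
    ranked = sorted(impostor_ivecs,
                    key=lambda iv: cosine(client_ivec, iv),
                    reverse=True)
    return ranked[:k]


def znorm_score(client_ivec, test_ivec, cohort):
    """Shift and scale the raw trial score using the client-vs-cohort score statistics."""
    raw = cosine(client_ivec, test_ivec)
    cohort_scores = [cosine(client_ivec, c) for c in cohort]
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    # Fall back to mean subtraction when the cohort scores have no spread.
    return (raw - mu) / sigma if sigma > 0 else raw - mu


def majority_vote(normalized_scores, threshold):
    """Accept the trial if more than half of the normalized scores reach the threshold."""
    accepts = sum(1 for s in normalized_scores if s >= threshold)
    return accepts > len(normalized_scores) / 2
```

In a phoneme-specific setting, one could compute one normalized score per phoneme class, each with its own cohort, and pass the resulting list to `majority_vote`; how the paper actually builds and fuses these scores is not specified in this record.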

Citation

Pages: 8273-8294

Page count: 21
