Analysis of I-Vector framework for Speaker Identification in TV-shows

Cited by: 0

Authors:

Fredouille, Corinne [1]

Charlet, Delphine [2]

Affiliations:

[1] Univ Avignon, CERI LIA, Avignon, France

[2] Orange Labs, Lannion, France

Source:

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014

Keywords:

speaker identification; i-vector; REPERE challenge; TV shows;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Inspired by Joint Factor Analysis, the i-vector approach has become the most popular, state-of-the-art framework for the speaker verification task. It has mainly been applied within the NIST/SRE evaluation campaigns, where many studies have been proposed to keep improving the performance of speaker verification systems. Nevertheless, while the i-vector framework has been used in other speech processing fields such as language recognition, very few studies have been reported for the speaker identification task on TV shows. This work was done in the context of the REPERE challenge, which focuses on the people recognition task in multimodal conditions (audio, video, text) from TV show corpora. The challenge participants are also invited to provide systems for monomodal tasks, such as speaker identification. The application of the i-vector framework is investigated from different points of view: (1) some of the i-vector-based approaches are compared, (2) a specific i-vector extraction protocol is proposed in order to deal with widely varying amounts of training data across the speaker population, and (3) the joint use of speaker diarization and identification is finally analyzed. Based on a 533-speaker dictionary, this joint system wins the monomodal speaker identification task of the 2014 REPERE challenge.


Pages: 71-75

Page count: 5
