PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0

Authors:

Kenny, Patrick

Stafylakis, Themos

Ouellet, Pierre

Alam, Md Jahangir

Dumouchel, Pierre

Affiliations:

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

speaker recognition; i-vectors; PLDA

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female, telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full-covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
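
Illustrative note: the idea of propagating i-vector uncertainty into the verification score can be sketched with a simplified two-covariance model. This is an assumption made for illustration, not the formulation used in the paper. Here an i-vector is taken to be zero-mean with between-speaker covariance `B`, within-speaker covariance `W`, and a per-utterance extraction uncertainty `U` that is larger for short utterances. The function and variable names are hypothetical, and only the single-enrollment case is shown. Under these assumptions the log likelihood ratio is the difference of two full-covariance Gaussian log densities evaluated on the stacked enrollment and test vectors.

```python
import numpy as np

def gaussian_logpdf(x, cov):
    """Log density at x of a zero-mean Gaussian with full covariance `cov`."""
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)

def llr_with_uncertainty(x_enroll, x_test, B, W, U_enroll, U_test):
    """Same-speaker vs. different-speaker log likelihood ratio.

    Assumed (simplified) model: x = y + e, with y ~ N(0, B) shared by all
    utterances of a speaker and e ~ N(0, W + U), where U is the
    utterance-specific uncertainty of the i-vector estimate.
    """
    d = x_enroll.size
    joint = np.concatenate([x_enroll, x_test])
    total_enroll = B + W + U_enroll
    total_test = B + W + U_test
    zeros = np.zeros((d, d))
    # Same speaker: the two vectors share y, so they are correlated through B.
    cov_same = np.block([[total_enroll, B], [B, total_test]])
    # Different speakers: the two vectors are independent.
    cov_diff = np.block([[total_enroll, zeros], [zeros, total_test]])
    return gaussian_logpdf(joint, cov_same) - gaussian_logpdf(joint, cov_diff)

# Toy two-dimensional example with made-up covariances.
B = 2.0 * np.eye(2)
W = 0.5 * np.eye(2)
x_enroll = np.array([1.0, -0.5])
x_test = np.array([0.9, -0.4])
score_long = llr_with_uncertainty(x_enroll, x_test, B, W,
                                  0.01 * np.eye(2), 0.01 * np.eye(2))
score_short = llr_with_uncertainty(x_enroll, x_test, B, W,
                                   10.0 * np.eye(2), 10.0 * np.eye(2))
print(score_long, score_short)
```

In this toy setting, increasing `U` pulls the score toward zero, which reflects the intuition that less reliable i-vectors should yield less confident decisions.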

Pages: 7649-7653

Page count: 5