Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0

Authors:

Xu, Longting [1]

Lee, Kong Aik [2]

Li, Haizhou [1]

Yang, Zhen [3]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan

[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An i-vector is a fixed-length, low-rank representation of a speech utterance that has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. In practice they do not, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise and, notably, the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address this problem, we propose a co-whitening approach that takes duration into account while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
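
The abstract does not give the derivation, so the following is only a minimal sketch of how a CCA-based joint whitening of paired short and long i-vectors could look, not the authors' implementation. It uses the textbook CCA solution: each view is whitened with its own covariance, and both are then rotated so that their cross-covariance becomes diagonal. The function names (`inv_sqrt`, `cca_co_whiten`), the synthetic data and the noise levels are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(cov, eps=1e-8):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    vals = np.maximum(vals, eps)  # guard against tiny or negative eigenvalues
    return (vecs / np.sqrt(vals)) @ vecs.T


def cca_co_whiten(short_iv, long_iv):
    """Estimate per-view projections from paired (n, d) i-vector arrays.

    Returns ((mu_s, a), (mu_l, b), rho): the view means, the projection
    matrices, and the canonical correlations in descending order.
    """
    n = short_iv.shape[0]
    mu_s, mu_l = short_iv.mean(axis=0), long_iv.mean(axis=0)
    xs, xl = short_iv - mu_s, long_iv - mu_l

    # Within-view and cross-view sample covariances.
    c_ss = xs.T @ xs / (n - 1)
    c_ll = xl.T @ xl / (n - 1)
    c_sl = xs.T @ xl / (n - 1)

    # Whiten each view separately, then align the two views with an SVD
    # of the whitened cross-covariance (standard CCA solution).
    w_s, w_l = inv_sqrt(c_ss), inv_sqrt(c_ll)
    u, rho, vt = np.linalg.svd(w_s @ c_sl @ w_l)
    return (mu_s, w_s @ u), (mu_l, w_l @ vt.T), rho


# Synthetic stand-in for paired i-vectors (assumption): the short-duration
# view carries more noise, hence larger variance, than the long-duration view.
rng = np.random.default_rng(0)
n, d = 2000, 5
speaker = rng.standard_normal((n, d))
long_iv = speaker + 0.3 * rng.standard_normal((n, d))
short_iv = speaker + 1.0 * rng.standard_normal((n, d))

(mu_s, a), (mu_l, b), rho = cca_co_whiten(short_iv, long_iv)
z_short = (short_iv - mu_s) @ a
z_long = (long_iv - mu_l) @ b

print(np.round(np.cov(z_short, rowvar=False), 3))  # covariance after projection
print(np.round(rho, 3))                            # canonical correlations
```

In this sketch each projected view has identity covariance on the training data, and the cross-covariance between the two projected views is diagonal with the canonical correlations on its diagonal. How the paper folds duration into the transform and how the result feeds the verification back end cannot be recovered from the abstract.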

Pages: 1066-1070

Page count: 5

