Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0
Authors
Xu, Longting [1]
Lee, Kong Aik [2]
Li, Haizhou [1]
Yang, Zhen [3]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan
[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An i-vector is a fixed-length, low-rank representation of a speech utterance that has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. In practice this is not the case, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise and, notably, the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address this problem, we propose a co-whitening approach that takes duration into account while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
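The abstract gives no formulas, so the following is only a minimal sketch of textbook CCA applied to paired vectors, not the paper's actual derivation. It whitens two paired sets (standing in for short- and long-duration i-vectors) with separate transforms and rotates them so that their cross-covariance becomes diagonal. All names (`inv_sqrt`, `fit_cca_whitening`, `reg`) and the synthetic data are assumptions made for illustration; no real i-vectors or NIST SRE data are involved.

```python
import numpy as np


def inv_sqrt(cov, eps=1e-8):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, eps)  # guard against tiny or negative eigenvalues
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def fit_cca_whitening(short_vecs, long_vecs, reg=1e-6):
    """Fit one whitening projection per set from paired rows (hypothetical helper).

    short_vecs, long_vecs: arrays of shape (n, d); row i of each comes from
    the same recording at two different durations (an assumption of this sketch).
    """
    mu_s, mu_l = short_vecs.mean(axis=0), long_vecs.mean(axis=0)
    xs, xl = short_vecs - mu_s, long_vecs - mu_l
    n, d = xs.shape
    c_ss = xs.T @ xs / (n - 1) + reg * np.eye(d)  # within-set covariance (short)
    c_ll = xl.T @ xl / (n - 1) + reg * np.eye(d)  # within-set covariance (long)
    c_sl = xs.T @ xl / (n - 1)                    # cross-covariance
    w_s, w_l = inv_sqrt(c_ss), inv_sqrt(c_ll)
    # The SVD of the whitened cross-covariance yields the canonical directions
    # and the canonical correlations (the singular values).
    u, corr, vt = np.linalg.svd(w_s @ c_sl @ w_l)
    return mu_s, mu_l, w_s @ u, w_l @ vt.T, corr


# Synthetic stand-in data: the "short" vectors are a distorted, noisier
# version of the "long" ones, mimicking their larger variance.
rng = np.random.default_rng(0)
long_vecs = rng.normal(size=(500, 4))
short_vecs = long_vecs @ rng.normal(size=(4, 4)) + 2.0 * rng.normal(size=(500, 4))

mu_s, mu_l, proj_s, proj_l, corr = fit_cca_whitening(short_vecs, long_vecs)
z_short = (short_vecs - mu_s) @ proj_s  # whitened short-duration vectors
z_long = (long_vecs - mu_l) @ proj_l    # whitened long-duration vectors
```

After projection each set has approximately identity covariance, and the cross-covariance between `z_short` and `z_long` is approximately `np.diag(corr)`, so corresponding dimensions are maximally correlated in the CCA sense.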
Pages: 1066 - 1070
Number of pages: 5