CNN-based joint mapping of short and long utterance i-vectors for speaker verification using short utterances

被引:9
|
作者
Guo, Jinxi [1 ]
Nookala, Usha Amrutha [1 ]
Alwan, Abeer [1 ]
机构
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA
来源
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017年
关键词
speaker verification; text-independent; short utterances; i-vectors; CNNs; joint modeling; PLDA;
D O I
10.21437/Interspeech.2017-430
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Text-independent speaker recognition using short utterances is a highly challenging task due to the large variation and content mismatch between short utterances. I-vector and probabilistic linear discriminant analysis (PLDA) based systems have become the standard in speaker verification applications, but they are less effective with short utterances. To address this issue, we propose a novel method. which trains a convolutional neural network (CNN) model to map the i-vectors extracted from short utterances to the corresponding long-utterance i-vectors. In order to simultaneously learn the representation of the original short-utterance i-vectors and fit the target long-version i-vectors. we jointly train a supervised-regression model with an autoencoder using CNNs. The trained CNN model is then used to generate the mapped version of short-utterance i-vectors in the evaluation stage. We compare our proposed CNN based joint mapping method with a GMM-based joint modeling method under matched and mismatched PLDA training conditions. Experimental results using the NIST SRE 2008 dataset show that the proposed technique achieves up to 30% relative improvement under duration mismatched PLDA-training conditions and outperforms the GMM-based method. The improved systems also perform better compared with the matched-length PLDA training condition using short utterances.
引用
收藏
页码:3712 / 3716
页数:5
相关论文
共 50 条
  • [21] Privacy-preserving speaker verification system based on binary I-vectors
    Mtibaa, Aymen
    Petrovska-Delacretaz, Dijana
    Boudy, Jerome
    Ben Hamida, Ahmed
    IET BIOMETRICS, 2021, 10 (03) : 233 - 245
  • [22] Improved i-vector extraction technique for speaker verification with short utterances
    Poddar A.
    Sahidullah M.
    Saha G.
    International Journal of Speech Technology, 2018, 21 (03) : 473 - 488
  • [23] i-vector Based Speaker Recognition on Short Utterances
    Kanagasundaram, Ahilan
    Vogt, Robbie
    Dean, David
    Sridharan, Sridha
    Mason, Michael
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2352 - +
  • [24] Novel Quality Metric for Duration Variability Compensation in Speaker Verification using i-Vectors
    Poddar, Arnab
    Sahidullah, Md
    Saha, Goutam
    2017 NINTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION (ICAPR), 2017, : 298 - 303
  • [25] Speaker verification using short utterances with DNN-based estimation of subglottal acoustic features
    Guo, Jinxi
    Yeung, Gary
    Muralidharan, Deepak
    Arsikere, Harish
    Afshan, Amber
    Alwan, Abeer
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2219 - 2222
  • [26] I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification
    Zhang, Jiacen
    Inoue, Nakamasa
    Shinoda, Koichi
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3613 - 3617
  • [27] Probabilistic approach using joint clean and noisy i-vectors modeling for speaker recognition
    Ben Kheder, Waad
    Matrouf, Driss
    Ajili, Moez
    Bonastre, Jean-Francois
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3638 - 3642
  • [28] Few-shot short utterance speaker verification using meta-learning
    Wang W.
    Zhao H.
    Yang Y.
    Chang Y.
    You H.
    PeerJ Computer Science, 2023, 9
  • [29] Few-shot short utterance speaker verification using meta-learning
    Wang, Weijie
    Zhao, Hong
    Yang, Yikun
    Chang, YouKang
    You, Haojie
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [30] DNN i-vector Speaker Verification with Short, Text-constrained Test Utterances
    Zhong, Jinghua
    Hu, Wenping
    Soong, Frank
    Meng, Helen
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1507 - 1511