Multi-Scale Kernels for Short Utterance Speaker Recognition

Cited by: 0
Authors
Zhang, Wei-Qiang [1]
Zhao, Junhong [2, 3]
Zhang, Wen-Lin [4]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China
Source
2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014
Funding
National Natural Science Foundation of China;
Keywords
speaker recognition; short utterance; multi-scale kernel;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Short utterances are a major challenge for speaker recognition because very limited data are available for training and testing. To give a robust estimate, the number of model parameters for a short utterance should be smaller than that for a long utterance; however, this may impede the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels with different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional method of a Gaussian mixture model supervector (GSV) followed by a support vector machine (SVM).
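The idea of building kernels at several scales and merging them into one can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the abstract does not say how a "scale" is defined, so the sketch treats it as the bandwidth `gamma` of an RBF kernel, and it replaces the MKL optimization with a simple kernel-target alignment heuristic for choosing the weights. The function names `rbf_kernel`, `alignment`, and `combine_kernels` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between rows of X and rows of Y.

    Assumption: `gamma` plays the role of the kernel "scale".
    """
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def alignment(K, y):
    """Kernel-target alignment: <K, y y^T>_F / (||K||_F * ||y y^T||_F)."""
    target = np.outer(y, y)
    return float((K * target).sum() / (np.linalg.norm(K) * np.linalg.norm(target)))


def combine_kernels(kernels, y):
    """Weight each kernel by its non-negative alignment with the labels.

    This heuristic is a stand-in for a real MKL solver. The weights are
    non-negative and sum to one, so the result is still a valid kernel.
    """
    scores = np.array([max(alignment(K, y), 0.0) for K in kernels])
    if scores.sum() == 0.0:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    else:
        weights = scores / scores.sum()
    combined = sum(w * K for w, K in zip(weights, kernels))
    return combined, weights


if __name__ == "__main__":
    # Toy two-class data standing in for per-utterance feature vectors.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (10, 4)), rng.normal(3.0, 1.0, (10, 4))])
    y = np.array([1] * 10 + [-1] * 10)

    kernels = [rbf_kernel(X, X, g) for g in (0.01, 0.1, 1.0)]
    K, weights = combine_kernels(kernels, y)
    print("kernel weights:", weights)
```

The combined matrix `K` could then be passed to any SVM implementation that accepts a precomputed kernel. A full MKL formulation would learn the weights jointly with the classifier rather than fixing them beforehand as done here.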
Pages: 414 / +
Page count: 2