Multi-Scale Kernels for Short Utterance Speaker Recognition

Authors
Zhang, Wei-Qiang [1]
Zhao, Junhong [2, 3]
Zhang, Wen-Lin [4]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China
Source
2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP) | 2014
Funding
National Natural Science Foundation of China
Keywords
speaker recognition; short utterance; multi-scale kernel;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short utterances pose a great challenge for speaker recognition because very limited data are available for training and testing. For robust estimation, a model for short utterances should have fewer parameters than one for long utterances; however, this may impair the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels at different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional Gaussian mixture model supervector (GSV) followed by support vector machine (SVM) method.
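The idea of combining kernels built at several scales can be illustrated with a small sketch. This is not the authors' method: the abstract does not say how the scales are defined or which MKL solver is used. The sketch assumes that "scale" means the bandwidth `gamma` of an RBF kernel, and it replaces MKL optimization with a simple kernel-target alignment heuristic for the weights. The function names (`rbf_kernel`, `alignment_weights`, `multi_scale_kernel`) and the toy data are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """RBF Gram matrix exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def alignment_weights(kernels, y):
    """Weight each kernel by its alignment with the label kernel y y^T.

    Assumption: this heuristic stands in for a real MKL optimizer.
    Negative alignments are clipped to zero and the weights are
    normalized to sum to one.
    """
    target = np.outer(y, y).astype(float)
    scores = []
    for K in kernels:
        a = np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target))
        scores.append(max(a, 0.0))
    scores = np.asarray(scores)
    if scores.sum() == 0.0:
        # Fall back to uniform weights if no kernel aligns with the labels.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()


def multi_scale_kernel(X, y, gammas):
    """Convex combination of RBF kernels computed at several bandwidths."""
    kernels = [rbf_kernel(X, g) for g in gammas]
    w = alignment_weights(kernels, y)
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))
    return K, w


# Toy data: two classes of 5-dimensional vectors (placeholders for supervectors).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (10, 5)), rng.normal(1.0, 1.0, (10, 5))])
y = np.array([-1] * 10 + [1] * 10)

K, w = multi_scale_kernel(X, y, gammas=[0.01, 0.1, 1.0])
```

Because the weights are non-negative and sum to one, the combined matrix `K` stays a valid (positive semidefinite) kernel, so it could be passed to any SVM implementation that accepts a precomputed Gram matrix.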
Pages: 414 / +
Page count: 2