Multi-Scale Kernels for Short Utterance Speaker Recognition

Cited by: 0
Authors
Zhang, Wei-Qiang [1]
Zhao, Junhong [2, 3]
Zhang, Wen-Lin [4]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China
Source
2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014
Funding
National Natural Science Foundation of China
Keywords
speaker recognition; short utterance; multi-scale kernel;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short utterances are a great challenge for speaker recognition because very limited data are available for training and testing. To obtain a robust estimate, a model for short utterances should have fewer parameters than a model for long utterances; however, this may limit the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels with different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional method of a Gaussian mixture model supervector (GSV) followed by a support vector machine (SVM).
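The idea of building kernels at several scales and combining them with learned weights can be illustrated with a small sketch. The abstract does not give the kernel form or the MKL optimizer, so everything here is an assumption: Gaussian (RBF) kernels whose bandwidth `gamma` plays the role of "scale", toy 2-D vectors standing in for GMM supervectors, and a simple kernel-target alignment heuristic used in place of a full MKL optimization. It is not the authors' method, only one way such a combination might look.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian kernel between two vectors; gamma sets the scale (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(data, gamma):
    """Gram matrix of the RBF kernel at one scale."""
    return [[rbf_kernel(x, y, gamma) for y in data] for x in data]

def alignment(K, labels):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F) for +/-1 labels."""
    n = len(labels)
    num = sum(K[i][j] * labels[i] * labels[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return num / (norm_k * n)  # ||yy^T||_F equals n when labels are +/-1

def multi_scale_kernel(data, labels, gammas):
    """Weight each scale by its (non-negative) alignment and sum the Gram matrices.

    Alignment weighting is a stand-in heuristic, not a full MKL optimization.
    """
    grams = [gram_matrix(data, g) for g in gammas]
    scores = [max(alignment(K, labels), 0.0) for K in grams]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        weights = [1.0 / len(gammas)] * len(gammas)  # fall back to uniform weights
    n = len(data)
    combined = [
        [sum(w * K[i][j] for w, K in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined

# Hypothetical toy data: two "speakers", two vectors each.
data = [(0.0, 0.1), (0.2, 0.0), (2.0, 2.1), (2.2, 1.9)]
labels = [1, 1, -1, -1]
gammas = [0.01, 1.0, 100.0]  # coarse, medium, and fine scales

weights, combined = multi_scale_kernel(data, labels, gammas)
print("scale weights:", [round(w, 3) for w in weights])
```

The combined Gram matrix could then be passed to any SVM that accepts a precomputed kernel. A convex combination of valid kernels is itself a valid kernel, which is why the weights are kept non-negative and normalized to sum to one.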
Pages: 414 / +
Number of pages: 2