Multi-Scale Kernels for Short Utterance Speaker Recognition

Cited by: 0

Authors:

Zhang, Wei-Qiang ^{[1]}

Zhao, Junhong ^{[2,3]}

Zhang, Wen-Lin ^{[4]}

Liu, Jia ^{[1]}

Affiliations:

[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China

[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China

[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Funding:

National Natural Science Foundation of China;

Keywords:

speaker recognition; short utterance; multi-scale kernel;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Short utterances pose a great challenge for speaker recognition because very limited data are available for training and testing. To obtain a robust estimate, the number of model parameters for a short utterance should be smaller than that for a long utterance; however, this may impair the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels at different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional Gaussian mixture model supervector (GSV) followed by support vector machine (SVM) method.
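The idea of combining kernels built at several scales can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the scales are defined or which MKL solver is used. The sketch assumes each scale yields a fixed-length feature vector per utterance (the random `feats` arrays and their dimensions 8, 32, and 128 are hypothetical stand-ins for supervectors), and it replaces MKL optimization with a simple kernel-target alignment heuristic for choosing the combination weights.

```python
import numpy as np

def linear_gram(X):
    """Linear-kernel Gram matrix, cosine-normalized to a unit diagonal."""
    K = X @ X.T
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def combine_kernels(grams, y):
    """Convex combination of Gram matrices.

    Weights are proportional to the (clipped) alignment of each kernel with
    the labels. This heuristic is a stand-in for MKL optimization, not the
    optimizer described in the paper.
    """
    a = np.array([max(alignment(K, y), 0.0) for K in grams])
    if a.sum() > 0:
        w = a / a.sum()
    else:
        w = np.full(len(grams), 1.0 / len(grams))  # fall back to uniform weights
    K = sum(wi * Ki for wi, Ki in zip(w, grams))
    return K, w

# Hypothetical data: 6 utterances, 2 speakers, features at three scales.
rng = np.random.default_rng(0)
y = np.array([1, 1, 1, -1, -1, -1])
feats = [rng.normal(size=(6, dim)) + 0.5 * y[:, None] for dim in (8, 32, 128)]

grams = [linear_gram(X) for X in feats]
K, w = combine_kernels(grams, y)
```

Because the weights are non-negative and sum to one, the combined matrix `K` remains a valid (positive semidefinite) kernel and could, for instance, be passed to an SVM that accepts a precomputed Gram matrix.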


Pages: 414 / +

Page count: 2

