Multi-Scale Kernels for Short Utterance Speaker Recognition

Cited by: 0

Authors:

Zhang, Wei-Qiang ^{[1]}

Zhao, Junhong ^{[2,3]}

Zhang, Wen-Lin ^{[4]}

Liu, Jia ^{[1]}

Affiliations:

[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China

[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China

[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Funding:

National Natural Science Foundation of China;

Keywords:

speaker recognition; short utterance; multi-scale kernel;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Short utterances pose a great challenge for speaker recognition because very limited data are available for training and testing. To obtain a robust estimate, a model for short utterances should have fewer parameters than one for long utterances; however, this may impede the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels with different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional Gaussian mixture model supervector (GSV) followed by support vector machine (SVM) method.
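The idea of combining kernels built at several scales can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel family or the MKL solver, so the sketch assumes RBF kernels with different bandwidths as the "scales" and substitutes a simple kernel-target alignment heuristic for true MKL optimization. The toy data and the names `rbf_kernel`, `alignment`, and `combine_kernels` are hypothetical.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of exp(-gamma * ||x - y||^2) over the rows of X."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(v * v for row in K for v in row))
    return num / (k_norm * n)  # ||yy^T||_F equals n for labels in {-1, +1}

def combine_kernels(kernels, y):
    """Convex combination of Gram matrices.

    Assumption: weights come from clipped kernel-target alignment,
    a stand-in for the MKL optimization mentioned in the abstract.
    """
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:
        weights = [1.0 / len(kernels)] * len(kernels)  # fall back to uniform
    else:
        weights = [s / total for s in scores]
    n = len(y)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, kernels))
                 for j in range(n)] for i in range(n)]
    return weights, combined

# Hypothetical toy data: two "speakers", two short feature vectors each.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [1.2, 0.9]]
y = [1, 1, -1, -1]

# One kernel per scale: small gamma is coarse, large gamma is fine.
gammas = [0.1, 1.0, 10.0]
kernels = [rbf_kernel(X, g) for g in gammas]
weights, combined = combine_kernels(kernels, y)
```

The resulting `combined` Gram matrix could then be passed to any SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.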

Citation

Pages: 414 / +

Page count: 2

