Multi-Scale Kernels for Short Utterance Speaker Recognition

Citations: 0

Authors:

Zhang, Wei-Qiang [1]

Zhao, Junhong [2,3]

Zhang, Wen-Lin [4]

Liu, Jia [1]

Affiliations:

[1] Tsinghua Univ, Tsinghua Natl Lab Informat Sci & Technol, Dept Elect Engn, Beijing 100084, Peoples R China

[2] Chinese Acad Sci, Inst Elect, State Key Lab Transducer Technol, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China

[4] Zhengzhou Informat Sci & Technol Inst, Zhengzhou 450002, Peoples R China

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Funding:

National Natural Science Foundation of China;

Keywords:

speaker recognition; short utterance; multi-scale kernel;

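DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: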

Short utterances pose a great challenge for speaker recognition, because very limited data are available for training and testing. To obtain a robust estimate, a model for short utterances should have fewer parameters than one for long utterances; however, this may limit the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels at different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional approach of Gaussian mixture model supervectors (GSV) followed by a support vector machine (SVM).
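The idea of combining kernels at several scales can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the base kernels or the MKL solver, so the sketch assumes RBF kernels with different bandwidths as the "scales" and uses kernel-target alignment as a simple stand-in for MKL weight optimization. The names `rbf_kernel`, `alignment`, and `multi_scale_kernel`, as well as the toy data, are hypothetical.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of an RBF kernel; gamma sets the scale (assumed base kernel)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in X] for x in X]

def alignment(K, labels):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F), labels in {-1, +1}."""
    n = len(labels)
    num = sum(K[i][j] * labels[i] * labels[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return num / (norm_k * n)  # ||yy^T||_F equals n for +/-1 labels

def multi_scale_kernel(X, labels, gammas):
    """Convex combination of kernels at several scales.

    Weights are proportional to each kernel's alignment with the labels,
    a heuristic used here in place of a full MKL optimization.
    """
    kernels = [rbf_kernel(X, g) for g in gammas]
    scores = [max(alignment(K, labels), 0.0) for K in kernels]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        weights = [1.0 / len(gammas)] * len(gammas)  # fall back to uniform weights
    n = len(X)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, kernels))
                 for j in range(n)] for i in range(n)]
    return combined, weights

# Toy data: two "speakers", two 2-D feature vectors each (made up for illustration).
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
labels = [1, 1, -1, -1]
gammas = [0.1, 1.0, 10.0]  # coarse, medium, and fine scales
combined, weights = multi_scale_kernel(X, labels, gammas)
```

The combined Gram matrix could then be passed to any SVM implementation that accepts a precomputed kernel. A real MKL solver would learn the weights jointly with the classifier rather than from alignment scores.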


Pages: 414 / +

Number of pages: 2

