HUMAN AND MACHINE SPEAKER RECOGNITION BASED ON SHORT TRIVIAL EVENTS

Cited by: 0

Authors:

Zhang, Miao [1,2]

Kang, Xiaofei [1,3]

Wang, Yanqing [1,2]

Li, Lantian [1]

Tang, Zhiyuan [1]

Dai, Haisheng [4]

Wang, Dong [1]

Affiliations:

[1] Tsinghua Univ, Ctr Speech & Language Technol, Beijing, Peoples R China

[2] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

[3] Peking Univ, Beijing, Peoples R China

[4] JD AI Res, Beijing, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Funding:

National Natural Science Foundation of China;

Keywords:

speaker recognition; speech perception; deep neural network; speaker feature learning;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Human speech often contains events that we call trivial events, e.g., cough, laugh and sniff. Compared to regular speech, these trivial events are usually short and variable; they are therefore generally regarded as not speaker discriminative and are largely ignored by present speaker recognition research. However, these trivial events are highly valuable in some particular circumstances such as forensic examination: because they are less subject to intentional change, they can be used to discover the genuine speaker from disguised speech. In this paper, we collect a trivial event speech database that involves 75 speakers and 6 types of events, and report preliminary speaker recognition results on this database, by both human listeners and machines. In particular, the deep feature learning technique recently proposed by our group is used to analyze and recognize the trivial events, leading to acceptable equal error rates (EERs) ranging from 5% to 15% despite the extremely short durations (0.2-0.5 seconds) of these events. Among the different types of events, 'hmm' seems more speaker discriminative.
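
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an equal error rate can be estimated from verification scores. It is not the authors' evaluation code. The function name `equal_error_rate`, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration; none of the numbers come from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from genuine (target) and impostor (nontarget) scores.

    Hypothetical helper: sweeps every observed score as a decision threshold
    and returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up similarity scores, for illustration only.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine, impostor)
    print(eer, threshold)  # prints: 0.25 0.6
```

With only a handful of trials the estimate is coarse; evaluations on real data typically use many more trials and may interpolate between thresholds.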

Pages: 5009 - 5013

Number of pages: 5
