HUMAN AND MACHINE SPEAKER RECOGNITION BASED ON SHORT TRIVIAL EVENTS

Cited by: 0

Authors:

Zhang, Miao [1,2]

Kang, Xiaofei [1,3]

Wang, Yanqing [1,2]

Li, Lantian [1]

Tang, Zhiyuan [1]

Dai, Haisheng [4]

Wang, Dong [1]

Affiliations:

[1] Tsinghua Univ, Ctr Speech & Language Technol, Beijing, Peoples R China

[2] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

[3] Peking Univ, Beijing, Peoples R China

[4] JD AI Res, Beijing, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Funding:

National Natural Science Foundation of China;

Keywords:

speaker recognition; speech perception; deep neural network; speaker feature learning;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Human speech often contains events that we call trivial events, e.g., coughs, laughs and sniffs. Compared to regular speech, these trivial events are usually short and variable; they are therefore generally regarded as not speaker discriminative and are largely ignored in current speaker recognition research. However, these trivial events are highly valuable in some particular circumstances such as forensic examination: because they are less subject to intentional change, they can be used to identify the genuine speaker in disguised speech. In this paper, we collect a trivial event speech database that involves 75 speakers and 6 types of events, and report preliminary speaker recognition results on this database, by both human listeners and machines. In particular, the deep feature learning technique recently proposed by our group is used to analyze and recognize the trivial events, leading to acceptable equal error rates (EERs) ranging from 5% to 15% despite the extremely short durations (0.2-0.5 seconds) of these events. Among the event types, 'hmm' appears more speaker discriminative than the others.


Pages: 5009-5013

Page count: 5

50 entries in total

[31] A Speaker Recognition Algorithm Based on Factor Analysis
Shen, Xuanjing
Zhai, Yujie
Wang, Yu
Chen, Haipeng
2014 7TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP 2014), 2014: 897-901
[32] Speaker recognition algorithm based on channel compensation
Shen X.-J.
Zhai Y.-J.
Lu Y.-T.
Wang Y.
Chen H.-P.
Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2016, 46 (03): 870-875
[33] Joint short-time speaker recognition and tracking using sparsity-based source detection
Guo, Yao
Zhu, Hongyan
ACTA ACUSTICA, 2023, 7
[34] Ensemble of Support Vector Machine for Text-Independent Speaker Recognition
Lei, Zhenchun
Yang, Yingchun
Wu, Zhaohui
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2006, 6 (5A): 163-167
[35] Robust features for text-independent speaker recognition with short utterances
Chakroun, Rania
Frikha, Mondher
NEURAL COMPUTING & APPLICATIONS, 2020, 32 (17): 13863-13883
[36] Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes
Li, Lantian
Wang, Dong
Zhang, Chenhao
Zheng, Thomas Fang
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (06): 1129-1139
[37] A short utterance speaker recognition method with improved cepstrum-CNN
Li, Yongfeng
Chang, Shuaishuai
Wu, QingE
SN APPLIED SCIENCES, 2022, 4 (12)
[38] SHORT UTTERANCE SPEAKER RECOGNITION BY RESERVOIR WITH SELF-ORGANIZED MAPPING
Ikeda, Narumitsu
Sato, Yoshinao
Takahashi, Hirokazu
2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018: 1073-1077
[39] Robust features for text-independent speaker recognition with short utterances
Rania Chakroun
Mondher Frikha
Neural Computing and Applications, 2020, 32: 13863-13883
[40] Sinc-attention feature extraction for trivial-event based speaker verification
Li, Lin
Li, Jun
Wang, Dingyi
Wang, Xiaoqin
Qiao, Shushan
ELECTRONICS LETTERS, 2023, 59 (09)
