Text-independent speaker identification based on deep Gaussian correlation supervector

Cited by: 0

Authors:

Linhui Sun

Ting Gu

Keli Xie

Jia Chen

Affiliations:

[1] Nanjing University of Posts and Telecommunications, College of Telecommunications & Information Engineering

[2] Ministry of Education, Key Lab of Broadband Wireless Communication and Sensor Network Technology

[3] Nanjing University of Posts and Telecommunications

Source:

International Journal of Speech Technology | 2019 / Vol. 22

Keywords:

Gaussian mixture model; Deep neural network; Speaker identification; Bottleneck feature; Deep Gaussian correlation supervector

DOI:

Not available

Abstract:

Great progress has been made in speaker recognition by extracting features from a Gaussian mixture model (GMM) or a deep neural network (DNN). In this paper, to capture the individual characteristics of speakers more accurately, we propose a novel deep Gaussian correlation supervector (DGCS) feature based on a DBN-GMM hybrid model. We first extract MFCC features from the preprocessed speech signals and use a DBN to obtain bottleneck features. The bottleneck features are then fed to a GMM to extract a deep Gaussian supervector (DGS), which can serve as the input to an SVM for pattern discrimination and decision. To account for the correlation between the deep mean vectors of the DGS, the DGS is then transformed into the DGCS by supervector recombination. Our experiments show that using the DGCS significantly improves the recognition rate: by 17.979% compared with the system using only the supervector, by 18.22% compared with the system using the DGS, and by 1.875% compared with the system using the correlation supervector. In addition, the results with the proposed DGCS show that the time complexity of the identification task can be greatly reduced.

Pages: 449-457

Page count: 8
