Text-independent speaker identification based on deep Gaussian correlation supervector

Cited by: 0

Authors:

Linhui Sun

Ting Gu

Keli Xie

Jia Chen

Affiliations:

[1] Nanjing University of Posts and Telecommunications, College of Telecommunications & Information Engineering

[2] Ministry of Education, Key Lab of Broadband Wireless Communication and Sensor Network Technology

[3] Nanjing University of Posts and Telecommunications

Source:

International Journal of Speech Technology | 2019 / Vol. 22, No. 2

Keywords:

Gaussian mixture model; Deep neural network; Speaker identification; Bottleneck feature; Deep Gaussian correlation supervector;

DOI:

Not available

Abstract:

Great progress has been made in speaker recognition by extracting features from a Gaussian mixture model (GMM) or a deep neural network (DNN). In this paper, to capture speaker-specific characteristics more accurately, we propose a novel deep Gaussian correlation supervector (DGCS) feature based on a DBN-GMM hybrid model that combines a deep belief network (DBN) with a GMM. In this method, we first extract Mel-frequency cepstral coefficients (MFCCs) from preprocessed speech signals and use the DBN to obtain bottleneck features. The bottleneck features are then fed to a GMM to extract a deep Gaussian supervector (DGS), which can serve as the input to a support vector machine (SVM) for pattern discrimination and decision. To further account for the correlation between the deep mean vectors of the DGS, the DGS is transformed into the DGCS by supervector recombination. Our experiments show that using the DGCS significantly improves the recognition rate: by 17.979% compared with the system using only the supervector, by 18.22% compared with the system using the DGS, and by 1.875% compared with the system using the correlation supervector. In addition, the results indicate that the time complexity of the identification task can be greatly reduced with the proposed DGCS.
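
The supervector steps named in the abstract can be illustrated with a minimal sketch. It assumes the GMM has already been fitted on bottleneck features and that its component means are available as plain lists. The abstract does not specify how supervector recombination works, so `correlation_supervector` below is a hypothetical stand-in (pairwise Pearson correlations between mean vectors), not the authors' method. All function names and the toy `means` values are assumptions made for illustration.

```python
import math


def gaussian_supervector(means):
    """Stack K component mean vectors (each of length D) into one K*D vector."""
    return [x for mean in means for x in mean]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mu_u = sum(u) / n
    mu_v = sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    denom = math.sqrt(var_u * var_v)
    # A constant vector has zero variance; report zero correlation in that case.
    return cov / denom if denom > 0 else 0.0


def correlation_supervector(means):
    """ASSUMPTION: recombine the means as their pairwise correlations.

    Returns the upper triangle (i < j) of the K x K correlation matrix
    between mean vectors. This is an illustrative guess, not the paper's
    recombination rule.
    """
    k = len(means)
    return [pearson(means[i], means[j]) for i in range(k) for j in range(i + 1, k)]


# Hypothetical means of a 3-component GMM over 4-dimensional bottleneck features.
means = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]

dgs = gaussian_supervector(means)      # length K * D
dgcs = correlation_supervector(means)  # length K * (K - 1) / 2
print(len(dgs), len(dgcs))
```

In a full system, either vector would be computed once per utterance and passed to an SVM classifier; that stage, the MFCC extraction, and the DBN are omitted here.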

Pages: 449-457

Number of pages: 8
