Text-independent speaker identification based on deep Gaussian correlation supervector

Cited by: 0
Authors
Linhui Sun
Ting Gu
Keli Xie
Jia Chen
Affiliations
[1] Nanjing University of Posts and Telecommunications, College of Telecommunications & Information Engineering
[2] Ministry of Education, Key Lab of Broadband Wireless Communication and Sensor Network Technology
[3] Nanjing University of Posts and Telecommunications
Source
International Journal of Speech Technology | 2019 / Vol. 22
Keywords
Gaussian mixture model; Deep neural network; Speaker identification; Bottleneck feature; Deep Gaussian correlation supervector
DOI
Not available
Abstract
Great progress has been made in speaker recognition by extracting features with a Gaussian mixture model (GMM) or a deep neural network (DNN). To capture speaker-specific characteristics more accurately, this paper proposes a novel deep Gaussian correlation supervector (DGCS) feature based on a hybrid of a deep belief network (DBN) and a GMM. The method first extracts Mel-frequency cepstral coefficients (MFCCs) from the preprocessed speech signal and uses the DBN to obtain bottleneck features. The bottleneck features are then fed to a GMM to extract a deep Gaussian supervector (DGS), which can serve as the input to a support vector machine (SVM) for classification. To further exploit the correlation between the deep mean vectors of the DGS, the DGS is transformed into the DGCS by supervector recombination. Experiments show that the DGCS improves the recognition rate by 17.979% over a system using only the supervector, by 18.22% over a system using the DGS, and by 1.875% over a system using the correlation supervector. The results also indicate that the proposed DGCS can substantially reduce the time complexity of the identification task.
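The supervector step described in the abstract can be illustrated with a minimal sketch. It assumes a diagonal-covariance background GMM and classical relevance-MAP adaptation of the component means, neither of which the abstract specifies; the function name `map_adapted_supervector`, the relevance factor of 16, and the random stand-in features are all hypothetical. The recombination that turns a DGS into a DGCS is not described in enough detail in the abstract to sketch, so it is left out.

```python
import numpy as np


def map_adapted_supervector(features, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted GMM means into a single supervector.

    features:  (T, D) frames, e.g. bottleneck features (assumed input)
    weights:   (K,)   mixture weights of a background GMM
    means:     (K, D) component means
    variances: (K, D) diagonal covariances
    relevance: assumed relevance factor controlling how far means move
    """
    # Per-frame, per-component diagonal Gaussian log-likelihoods: (T, K)
    diff = features[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )

    # Posterior responsibilities, computed stably in the log domain
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics
    n_k = post.sum(axis=0)                      # (K,)
    data_means = (post.T @ features) / np.maximum(n_k, 1e-10)[:, None]

    # Interpolate between data-driven means and background means
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted = alpha * data_means + (1.0 - alpha) * means

    # Concatenate the K adapted mean vectors into one (K * D,) vector
    return adapted.reshape(-1)


# Toy usage with random stand-in data (not real speech features)
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 4))
ubm_weights = np.full(3, 1.0 / 3.0)
ubm_means = rng.normal(size=(3, 4))
ubm_vars = np.ones((3, 4))

supervector = map_adapted_supervector(frames, ubm_weights, ubm_means, ubm_vars)
print(supervector.shape)  # (12,)
```

With K components and D-dimensional features, the result has K * D entries, one fixed-length vector per utterance regardless of its duration, which is what allows it to be passed to a classifier such as an SVM.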
Pages: 449-457
Page count: 8
Related papers
  • [1] Text-independent speaker identification based on deep Gaussian correlation supervector
    Sun, Linhui
    Gu, Ting
    Xie, Keli
    Chen, Jia
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (02) : 449 - 457
  • [2] A novel text-independent speaker identification method based on common Gaussian bases
    Hao, Chen
    Zhao, Rongchun
    2005 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND TECHNOLOGY, PROCEEDINGS, 2005, : 72 - 78
  • [3] Improved Text-Independent Speaker Identification and Verification with Gaussian Mixture Models
    Chakroun, Rania
    Frikha, Mondher
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 3 - 10
  • [4] Text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution
    Miyajima, C
    Hattori, Y
    Tokuda, K
    Masuko, T
    Kobayashi, T
    Kitamura, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2001, E84D (07): : 847 - 855
  • [5] On the use of Classifiers for Text-independent Speaker Identification
    Jawarkar, Naresh P.
    Holambe, Raghunath S.
    Basu, Tapan Kumar
    2014 FIRST INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL, ENERGY & SYSTEMS (ACES-14), 2014, : 238 - 242
  • [6] An MFCC-based text-independent speaker identification system for access control
    Liu, Jung-Chun
    Leu, Fang-Yie
    Lin, Guan-Liang
    Susanto, Heru
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (02)
  • [7] Text-Independent Speaker Identification Using the Histogram Transform Model
    Ma, Zhanyu
    Yu, Hong
    Tan, Zheng-Hua
    Guo, Jun
    IEEE ACCESS, 2016, 4 : 9733 - 9739
  • [8] Text-Independent Speaker Identification Through Feature Fusion and Deep Neural Network
    Jahangir, Rashid
    Teh, Ying Wah
    Memon, Nisar Ahmed
    Mujtaba, Ghulam
    Zareei, Mahdi
    Ishtiaq, Uzair
    Akhtar, Muhammad Zaheer
    Ali, Ihsan
    IEEE ACCESS, 2020, 8 : 32187 - 32202
  • [9] Fused Mel Feature sets based Text-Independent Speaker Identification using Gaussian Mixture Model
    Kumari, R. Shantha Selva
    Nidhyananthan, S. Selva
    Anand, G.
    INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY AND SYSTEM DESIGN 2011, 2012, 30 : 319 - 326
  • [10] A New Set of Features for Text-Independent Speaker Identification
    Espy-Wilson, Carol Y.
    Manocha, Sandeep
    Vishnubhotla, Srikanth
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1475 - +