Text-independent speaker identification based on deep Gaussian correlation supervector

Cited by: 0
Authors
Linhui Sun
Ting Gu
Keli Xie
Jia Chen
Affiliations
[1] Nanjing University of Posts and Telecommunications, College of Telecommunications & Information Engineering
[2] Ministry of Education, Key Lab of Broadband Wireless Communication and Sensor Network Technology
[3] Nanjing University of Posts and Telecommunications
Source
International Journal of Speech Technology | 2019 / Vol. 22
Keywords
Gaussian mixture model; Deep neural network; Speaker identification; Bottleneck feature; Deep Gaussian correlation supervector
DOI
Not available
Abstract
Great progress has been made in speaker recognition by extracting features from a Gaussian mixture model (GMM) or a deep neural network (DNN). In this paper, to extract the individual characteristics of speakers more accurately, we propose a novel deep Gaussian correlation supervector (DGCS) feature based on a DBN-GMM hybrid model. In this method, we first extract MFCCs from preprocessed speech signals and employ a DBN to obtain bottleneck features. The bottleneck features are then fed to a GMM to extract a deep Gaussian supervector (DGS), which can serve as the input to an SVM for pattern discrimination and decision. Further, taking into account the correlation between the deep mean vectors of the DGS, the DGS is transformed into the DGCS by supervector recombination. Our experiments show that using the DGCS significantly improves the recognition rate: by 17.979% compared to the system with the supervector only, by 18.22% compared to the system with the DGS, and by 1.875% compared to the system with the correlation supervector. In addition, the results with the proposed DGCS show that the time complexity of the identification task can be greatly reduced.
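The abstract does not say how the Gaussian supervector is computed from the GMM. As a rough illustration only, the sketch below assumes the common approach of relevance-MAP adaptation of the component means of a diagonal-covariance background GMM, followed by stacking the adapted means into one vector. The function names, the `relevance` value, and the toy frames (standing in for DBN bottleneck features) are all hypothetical; the DBN, the supervector recombination that produces the DGCS, and the SVM are not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted component means into a single supervector.

    Assumption: relevance-MAP adaptation of the means only; this is an
    illustrative choice, not necessarily the procedure used in the paper.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k                      # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]    # posterior-weighted frame sums
    for x in frames:
        post = responsibilities(x, weights, means, variances)
        for c in range(k):
            counts[c] += post[c]
            for j in range(d):
                sums[c][j] += post[c] * x[j]

    supervector = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)  # data vs. prior weight
        for j in range(d):
            data_mean = sums[c][j] / counts[c] if counts[c] > 0 else means[c][j]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[c][j])
    return supervector


# Toy background model: 2 components, 2-dimensional features.
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [4.0, 4.0]]
toy_variances = [[1.0, 1.0], [1.0, 1.0]]
toy_frames = [[1.0, 1.0]] * 100  # placeholder for bottleneck features

toy_supervector = map_adapted_supervector(
    toy_frames, toy_weights, toy_means, toy_variances
)
```

With K components and D-dimensional features the resulting vector has K x D entries, one per adapted mean coordinate; a fixed-length vector of this kind is what would be passed to a classifier such as an SVM.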
Pages: 449 - 457
Number of pages: 8
Related papers
50 in total
  • [31] An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification
    Lu, Xugang
    Dang, Jianwu
    SPEECH COMMUNICATION, 2008, 50 (04) : 312 - 322
  • [32] Improving the Generalized Performance of Deep Embedding for Text-Independent Speaker Verification
    Li, Rongjin
    Li, Lin
    Hong, Qingyang
    Guo, Huiyang
    Zhao, Miao
    PROCEEDINGS OF 2018 12TH IEEE INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (ASID), 2018, : 21 - 25
  • [33] Text-independent speaker identification in environment using singular value decomposition
    Aldhaheri, RW
    Al-Saadi, FE
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1624 - 1628
  • [34] Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach
    Jawarkar, N. P.
    Holambe, R. S.
    Basu, T. K.
    FRONTIERS IN COMPUTER EDUCATION, 2012, 133 : 569 - +
  • [35] Self-Organizing Mixture Models for Text-Independent Speaker Identification
    Bouziane, Ayoub
    Kharroubi, Jamal
    Zarghili, Arsalane
    2014 THIRD IEEE INTERNATIONAL COLLOQUIUM IN INFORMATION SCIENCE AND TECHNOLOGY (CIST'14), 2014, : 345 - 350
  • [36] Text-Independent Speaker Identification Using Formants and Convolutional Neural Networks
    Camarena-Ibarrola, Antonio
    Reynoso, Miguel
    Figueroa, Karina
    ADVANCES IN SOFT COMPUTING (MICAI 2021), PT II, 2021, 13068 : 108 - 119
  • [37] Binary quantization of feature vectors for robust text-independent speaker identification
    Yuan, ZX
    Xu, BL
    Yu, CZ
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1999, 7 (01) : 70 - 78
  • [38] A Text-Independent Speaker Verification System Based on Cross Entropy
    Lu, Xiaochun
    Yin, Junxun
    COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS, 2009, 51 : 419 - 426
  • [39] A TEXT-INDEPENDENT SPEAKER RECOGNITION SYSTEM BASED ON VOWEL SPOTTING
    FAKOTAKIS, N
    TSOPANOGLOU, A
    KOKKINAKIS, G
    SPEECH COMMUNICATION, 1993, 12 (01) : 57 - 68
  • [40] An Improved Approach for Text-Independent Speaker Recognition
    Chakroun, Rania
    Zouari, Leila Beltaifa
    Frikha, Mondher
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (08) : 343 - 348